Transferring Large Data Flows on Chameleon

Scientific research requires efficient transfer systems for large data flows, including high-performance data movement in experimental cloud environments. Achieving high-speed network data transfer requires tuning many components, such as the maximum TCP congestion window size, the MTU, and the NIC tx/rx ring sizes. Learning, setting up, and tuning the system and tools often take up the majority of an experimenter's time. Providing an optimized environment, experiment tools, and reference performance results helps researchers be more productive.
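To illustrate why the TCP window size matters on a long fat network, the bandwidth-delay product (BDP) estimates how much data must be in flight to keep the path full. The following is a minimal sketch, not part of the appliance: the function name bdp_bytes is hypothetical, and the 10 Gbit/s link speed and 30 ms round-trip time are assumed example values rather than measured Chameleon figures.

```shell
#!/bin/sh
# bdp_bytes LINK_GBPS RTT_MS
# Print the bandwidth-delay product in bytes (hypothetical helper).
bdp_bytes() {
    gbps=$1
    rtt_ms=$2
    # 1 Gbit/s * 1 ms = 10^9 bit/s * 10^-3 s = 10^6 bits; divide by 8 for bytes.
    echo $(( gbps * 1000000 * rtt_ms / 8 ))
}

# Assumed example: 10 Gbit/s link, 30 ms round-trip time.
bdp_bytes 10 30
```

Under these assumptions the result is 37500000 bytes (about 37.5 MB), so a TCP window much smaller than this could not fill such a path.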

 

We prepared a Data Transfer Node (DTN) that provides efficient network data transfer over a long fat network. The Chameleon Large Data Flow Appliance is a template that users can use to spawn a set of DTNs in Chameleon Cloud. Each ready-to-use DTN instance is carefully tuned to improve throughput for large data flows, such as big data transfers and streaming. A DTN in Chameleon Cloud allows users to experiment with sharing large datasets in scientific collaborations, develop transfer applications for high-speed networks, and simulate various network environments in the cloud.

 

One useful use case is to measure network throughput and disk-to-disk transfer performance between DTNs. Users can run an automated test that measures the disk I/O rate, the available network capacity, and the end-to-end disk transfer performance between the optimized DTNs. They can also emulate various levels of delay, jitter, and packet loss to simulate a specific network environment. Another use case is to test different file transfer protocols.

 

To run such tests, log in to one of the DTNs and enter the following commands:

 

cd Cham_LF_test
sudo ./set_network_delay.bash -s
./run_server.bash

 

After that, log in to another DTN through the head node and enter:

cd Cham_LF_test
./run_test.bash IP_of_server

 

 

This runs a series of tests that benchmark the disk speed and network speed of the DTNs with 30 ms of delay and 3 ms of jitter. The user can reset the network emulation using:

 

cd Cham_LF_test
sudo ./set_network_delay.bash -u
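For readers curious about what this kind of emulation involves, delay and jitter are commonly applied on Linux with the tc netem queueing discipline. The sketch below shows that general technique and is an assumption, not the contents of set_network_delay.bash. The function names are hypothetical, and the interface name eth0 is a placeholder (check `ip link` for the real one). The script only prints the commands so they can be reviewed before being run with sudo.

```shell
#!/bin/sh
# Build (but do not run) tc commands for netem delay emulation.
# IFACE is a placeholder interface name, not taken from the appliance.
IFACE=${IFACE:-eth0}

# netem_add_cmd DELAY_MS JITTER_MS: print the command that adds delay and jitter.
netem_add_cmd() {
    echo "tc qdisc add dev $IFACE root netem delay ${1}ms ${2}ms"
}

# netem_del_cmd: print the command that removes the netem qdisc again.
netem_del_cmd() {
    echo "tc qdisc del dev $IFACE root netem"
}

# Values matching the test described above: 30 ms delay, 3 ms jitter.
netem_add_cmd 30 3
netem_del_cmd
```

Printing instead of executing keeps the sketch safe to run on any machine; the printed lines need root privileges to take effect.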

 

This set of tools and test scripts enables users to experiment with high-speed networks and high-performance DTNs in Chameleon Cloud. Users can build their own tools on top of the optimized DTN and test their performance.


 

Contribution Meta-Data:

Type of contribution: Software/Chameleon appliance

Author(s): Se-young Yu, Fei Yeh and Jim Chen

Link to code/repository:  https://www.chameleoncloud.org/appliances/50/

License/terms of use:  GPL V2

Link to documentation: https://www.chameleoncloud.org/appliances/50/docs/

Dependencies: Python3, CC-Ubuntu16.04-20180413

Support email: young.yu@northwestern.edu

Community contact: young.yu@northwestern.edu

 

