Transferring Large Data Flows on Chameleon
Oct. 19, 2018 by Se-young Yu
Scientific research requires efficient transfer systems for large data flows, including high-performance data movement in experimental cloud environments. Achieving high-speed network data transfer requires tuning many components, such as the maximum TCP congestion window size, the MTU, and the NIC tx/rx ring sizes. Learning, setting up, and tuning the system and tools often take up the majority of an experimenter's time. Providing an optimized environment, tools for experiments, and reference performance results helps researchers be more productive.
We prepared a Data Transfer Node (DTN) that provides efficient network data transfer over a long fat network. The Chameleon Large Data Flow Appliance is a template that users can use to spawn a set of DTNs in Chameleon Cloud. Each ready-to-use DTN instance is carefully tuned to improve throughput for large data flows, such as big data transfers and streaming. A DTN in Chameleon Cloud allows users to experiment with sharing large datasets in scientific collaborations, develop transfer applications for high-speed networks, and simulate various network environments in the cloud.
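To make the term "long fat network" concrete: such a path has a large bandwidth-delay product (BDP), which is the amount of data that must be in flight to keep the link full, and therefore a lower bound on the TCP window a tuned host has to allow. The sketch below works through that arithmetic. The 10 Gbit/s link rate and the `bdp_bytes` helper are assumptions for illustration only; they are not values or tools from the appliance.

```shell
#!/usr/bin/env bash
# Hypothetical helper: bandwidth-delay product in bytes.
#   $1 = link rate in Gbit/s (assumed value, not from the appliance)
#   $2 = round-trip time in milliseconds
bdp_bytes() {
  local gbps=$1 rtt_ms=$2
  # (gbps * 10^9 bit/s) * (rtt_ms / 1000 s) / 8 bit per byte
  #   = gbps * 10^6 * rtt_ms / 8
  echo $(( gbps * 1000000 * rtt_ms / 8 ))
}

# Example: an assumed 10 Gbit/s path with a 30 ms round-trip time.
bdp_bytes 10 30
```

Under these assumptions the result is 37500000 bytes (37.5 MB), which is far larger than typical default socket buffer limits and illustrates why the window size needs tuning on such paths.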
One useful use case is to measure network throughput and disk-to-disk transfer performance between DTNs. Users can run an automated test to measure disk I/O rate, available network capacity, and end-to-end disk transfer performance between the optimized DTNs. They can also emulate various levels of delay, jitter, and packet loss to simulate a specific network environment. Another use case is to test different file transfer protocols.
To run such a test, simply log in to one of the DTNs and enter the following commands:
cd Cham_LF_test
sudo ./set_network_delay.bash -s
./run_server.bash
After that, log in to another DTN through the head node and enter:
cd Cham_LF_test
./run_test.bash IP_of_server
This will run a series of tests, benchmarking the disk speed and network speed of the DTNs with 30 ms delay and 3 ms jitter. The user can reset the network emulation using:
cd Cham_LF_test
sudo ./set_network_delay.bash -u
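The internals of set_network_delay.bash are not shown in this post. On Linux, delay and jitter of this kind are commonly emulated with the `tc` netem queueing discipline, so a rough sketch of equivalent commands might look like the following. The `netem_cmd` helper and the interface name `eth0` are hypothetical, and the function only prints the command line instead of running it, since applying it requires root privileges.

```shell
#!/usr/bin/env bash
# Hypothetical helper: builds (does not run) a tc netem command line.
# The interface name and this function are assumptions for illustration;
# they are not taken from set_network_delay.bash.
netem_cmd() {
  local action=$1 dev=$2 delay_ms=$3 jitter_ms=$4
  case "$action" in
    add) echo "tc qdisc add dev $dev root netem delay ${delay_ms}ms ${jitter_ms}ms" ;;
    del) echo "tc qdisc del dev $dev root netem" ;;
    *)   echo "usage: netem_cmd add|del DEV [DELAY_MS JITTER_MS]" >&2; return 1 ;;
  esac
}

netem_cmd add eth0 30 3   # 30 ms delay with 3 ms jitter, as in the test above
netem_cmd del eth0        # remove the emulation again
```

To apply a printed command on a real host, it would be run with sudo on the interface that actually carries the test traffic.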
This set of tools and test scripts enables users to experiment with high-speed networks and high-performance DTNs in Chameleon Cloud. Users can build their own tools on top of the optimized DTN and test their performance.
Contribution Meta-Data:
Type of contribution: Software/Chameleon appliance
Author(s): Se-young Yu, Fei Yeh and Jim Chen
Link to code/repository: https://www.chameleoncloud.org/appliances/50/
License/terms of use: GPL V2
Link to documentation: https://www.chameleoncloud.org/appliances/50/docs/
Dependencies: Python3, CC-Ubuntu16.04-20180413
Support email: young.yu@northwestern.edu
Community contact: young.yu@northwestern.edu