Trovi: the Google Drive for Chameleon Experiments
Nov. 16, 2020 by Jason Anderson
One important aspect of reproducibility is discoverability--what good is a packaged, reproducible experiment if nobody can find it and use it? One thinks of trees falling in forests. Consider the problem of the professor who wants to design a class for the next year, where students submit lab assignments: how do they provide these course materials such that students can work through them, ideally without fighting with library dependencies and OS problems that ultimately distract from the lab's goal? Or, consider the problem of a researcher preparing a paper for submission: how can they show the latest run of results to their advisor or colleagues? How can they encapsulate their work into a digital artifact that meets the conference's criteria? Finally, consider that you're getting started with a testbed: how do you find some useful examples of what others have done, which possibly highlight possibilities you didn't previously consider?
This is why we are providing new capabilities for finding, saving, and sharing experiments on Chameleon, which we collectively dub Trovi (from the Esperanto trovi, meaning "to find"). With Trovi you can package your experiments as Jupyter notebooks, save iterations of your experiment privately, share your work with others, and ultimately publish your experiment as a citable research artifact.
Here's a look at how this tool and others can help you in your day-to-day work.
Problem: How can I find an already-packaged experiment that I could ask my students to reproduce and extend?
Trovi already has several experiments publicly available for all Chameleon users. They include examples of actual experiments, "quick start" tutorials, and examples illustrating some of the testbed's more advanced capabilities (see the end of this post for a full list). You can search for experiments by keyword and filter for experiments that are private to you, shared with a particular project, or publicly available. For example, as a teacher, you could share all of your class assignments under a particular project--any students you add to your project will automatically be able to see and launch the assignments (as Jupyter notebooks and supporting documentation/files) on Chameleon.
Problem: Setting up my environment again for a new lease takes significant time and steps.
Chameleon users are increasingly using Jupyter notebooks to package their experiment setup. Experiments on Trovi are typically expressed as Jupyter notebooks, where each cell represents a step in the experiment workflow: making a reservation for resources, configuring those resources on the testbed, and then executing the experiment steps. Performing these steps within Jupyter is easy, as the Chameleon CLI and Python clients are fully integrated. For a preview, see the OpenFlow Quick Start example, which you can launch on Trovi and try out yourself.
With this method, setting up your experiment environment again is simple, and the setup is easy to replicate when you come back later.
Problem: I want to package my experimental environment but I'm not sure how.
To snapshot the disk of any of your bare metal nodes, use the cc-snapshot utility, which comes built in on all Chameleon base disk images. This is a good way to save a configuration of, e.g., kernel modules, installed packages, and settings. Another option is to use Jupyter notebook cells to perform these operations, as the power experiment example does. Finally, you can use Ansible, which can simplify some aspects of remote configuration but does require a bit of familiarity with the tool--the JupyterHub example uses this approach.
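As a rough illustration of the first option, the sketch below builds the snapshot command in Python, for example to run from a notebook cell over SSH. It assumes cc-snapshot is run with sudo and takes the new image name as a single positional argument; the helper name and the image name are hypothetical, so check the Chameleon documentation for the options your image supports.

```python
import shlex
from typing import List


def build_snapshot_command(image_name: str) -> List[str]:
    """Build the argv for snapshotting a node's disk with cc-snapshot.

    Assumption: the utility is invoked as `sudo cc-snapshot <image_name>`.
    """
    name = image_name.strip()
    if not name:
        raise ValueError("image name must not be empty")
    return ["sudo", "cc-snapshot", name]


# Hypothetical image name, for illustration only.
command = build_snapshot_command("my-experiment-image")

# Print a shell-safe string that could be run on the bare metal node.
print(shlex.join(command))
```

Building the command as a list and quoting it with shlex.join keeps unusual image names from being misread by the shell.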
Problem: My packaged experiment works well for me, but how can I show it to somebody else?
This is perhaps the most powerful feature of Trovi: you can share your packaged experiments with entire Chameleon projects or via a private link, similar to how Google Drive documents can be shared. Check out the docs to learn more--one nice benefit is that any Chameleon user who has access to your experiment will be able to easily launch it within their own account.
Because a GIF is worth a thousand words, here is how easy it is to re-launch an experiment saved to Trovi:
Problem: How do I reference a packaged Chameleon experiment as a digital artifact?
Increasingly, conferences encourage attaching digital artifacts to paper submissions. This is very easy to do with an experiment packaged on Trovi! You can publish any saved version of your experiment to Zenodo, which will archive it in long-term storage and additionally assign a unique DOI suitable for citation. As an example, here's the DOI assigned to the JupyterHub example: 10.5281/zenodo.3463619
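To cite or link the artifact, a DOI is usually turned into a resolvable URL through the public doi.org resolver. The following is a minimal sketch of that conversion using the DOI quoted above; the function name and the light validation are illustrative assumptions, not part of Trovi or Zenodo.

```python
from urllib.parse import quote

DOI_RESOLVER = "https://doi.org/"


def doi_to_url(doi: str) -> str:
    """Turn a bare DOI such as '10.5281/zenodo.3463619' into a resolver URL."""
    cleaned = doi.strip()
    # DOIs start with the '10.' directory indicator and have a prefix/suffix pair.
    if not cleaned.startswith("10.") or "/" not in cleaned:
        raise ValueError(f"not a DOI: {doi!r}")
    # Percent-encode anything unusual, but keep the prefix/suffix slash intact.
    return DOI_RESOLVER + quote(cleaned, safe="/")


print(doi_to_url("10.5281/zenodo.3463619"))
# prints https://doi.org/10.5281/zenodo.3463619
```

The resulting link can go directly into a paper's artifact appendix or a course page.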
Check out the existing content available on Trovi, and feel free to share your own experiments for others to browse! Try out the Chameleon Quick Start notebooks to get started on Chameleon, or browse the Example Experiments section to see real experiments from foundational papers, which you can use in classes or try to extend.
Chameleon Quick Start:
OpenFlow Quick Start Example: an artifact designed to help you get started using OpenFlow on Chameleon; it can be used as a base for OpenFlow experiments or advanced network appliances.
Jupyter Usage Metric Exploration: This notebook is an example data analysis notebook looking at usage patterns on Chameleon. Feel free to use it as a model for your own data analysis needs.
Power Management Experiment Example: This example illustrates how to create a reproducible experiment in power management and describes tools available within the Chameleon base images, as well as the orchestration and snapshot capabilities.
Example Experiments on Chameleon:
Image Classification with AlexNet on the Stanford Dogs Dataset: a machine learning experiment packaged as a Jupyter notebook, designed to be run with tools available within Chameleon and OpenStack. The packaged notebook is available on Zenodo and can be reproduced in ~1 hour, making it well suited to teaching machine learning or how to use the Chameleon testbed.
Tiny-Tail Flash: Near Perfect Elimination of Garbage Collection Tail Latencies in NAND SSDs Reproduction: For this experiment, a Jupyter notebook is packaged and available from Zenodo, reproducing the Dev Tools Release experiment from this paper.