Chameleon Changelog for November 2024

Dear Chameleon Users,

It has been a busy month for Chameleon. On November 18th, in Atlanta, we hosted the 5th Chameleon User Meeting on the topic of practical reproducibility. It featured keynotes from Dr. Kate Keahey (Argonne) and Dr. Torsten Hoefler (ETH Zürich) and, most importantly, fantastic presentations from many of our users who participated as experiment authors or reviewers in various reproducibility initiatives and used Chameleon as part of that work. The workshop carried over into a BoF session we hosted, where we discussed how to better support reproducibility initiatives. Also, at Supercomputing ’24, we hosted a Chameleon Tutorial covering a range of topics, from getting started with the testbed, to reproducible workflows, to advanced orchestration with Heat and Ansible. Chameleon was also featured in several ACM Student Research Competition posters from students who worked with us this summer.

When we were not busy meeting and interacting with our users, we were working hard on making the system work better for them. We’ve been investing in end-of-year housekeeping to ensure a smooth kick-off to the new year; this is unglamorous work, but we hope it will make the system more robust and easier to use. If you are having any problems with Chameleon or general usability issues, this is a great time to let us know about them via our help desk. Operators are standing by. Here are some of the things that should work better now:

Improved Floating IP availability. Most users need a public IP to access their node once it is provisioned (unless they connect over the private network via a bastion host). Chameleon sites have two pools of floating IPs (“floating” means that they are assigned to nodes ephemerally, effectively “floating” between nodes): ad-hoc and reservable. It is often the case that one of these pools is fully used up by experiments while the other is not, so if you get an error while trying to reserve or allocate an IP, we recommend trying the other pool. That said, this month we expanded the reservable pool of IPs at CHI@TACC, which will reduce the number of times we run out of IPs at the site. Additionally, we’ve optimized our floating IP reaper, which reclaims allocated ad-hoc IPs that sit idle, similar to our idle lease reaper for nodes, to prevent resource hoarding. Please keep in mind as you use Chameleon that resources are shared between experimenters, so be mindful of others and release IPs or other resources when they are not in use.
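
For users who script their experiments, the "try the other pool" advice can be expressed as a small fallback loop. The sketch below is illustrative only: PoolExhausted, acquire_floating_ip, and the allocate callable are hypothetical names, not part of any Chameleon or OpenStack API, and the pool names are simply the labels used above. In a real script, allocate would wrap whatever call you already use to allocate or reserve an IP.

```python
from typing import Callable, Optional, Sequence, Tuple


class PoolExhausted(Exception):
    """Hypothetical error: the requested floating IP pool has no free address."""


def acquire_floating_ip(
    pools: Sequence[str],
    allocate: Callable[[str], str],
) -> Tuple[str, str]:
    """Try each pool in order and return (pool_name, ip) for the first success.

    `allocate` is a caller-supplied function (an assumption of this sketch)
    that returns an IP string or raises PoolExhausted for the given pool.
    """
    last_error: Optional[Exception] = None
    for pool in pools:
        try:
            return pool, allocate(pool)
        except PoolExhausted as err:
            # This pool is used up; remember the error and try the next one.
            last_error = err
    raise PoolExhausted(
        "no floating IP available in any of: " + ", ".join(pools)
    ) from last_error


def fake_allocate(pool: str) -> str:
    """Stand-in allocator for demonstration: the ad-hoc pool is always empty."""
    if pool == "ad-hoc":
        raise PoolExhausted("ad-hoc pool is empty")
    return "192.0.2.10"  # address from the documentation-only range


if __name__ == "__main__":
    pool, ip = acquire_floating_ip(["ad-hoc", "reservable"], fake_allocate)
    print(pool, ip)
```

Whichever pool the address comes from, remember to release it when the experiment is finished.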

Testbed improvements. This month we’ve also improved several parts of the core testbed. First, users would sometimes see a “wrong number of charges” error when extending or updating a lease; this is now fixed. We also fixed an issue with our orchestration service, Heat, which did not properly allow you to specify your reservation when launching an appliance. In the web dashboard, Horizon, we fixed a bug that prevented users from creating object store containers via the GUI. Behind the scenes, we’ve also been improving our operator testing system, which allows us to detect Chameleon service faults and hardware issues before users encounter them.

Trovi version bugfix. Our repository of artifacts, Trovi, allows users to upload experimental artifacts to share with other users. Trovi integrates with Chameleon, allowing users to launch and execute an artifact with one click. Trovi also allows you to version artifacts, tracking historical versions of the artifact files for reproducibility purposes. This month, we’ve fixed a bug where multiple versions were sometimes given the same name, which meant that you couldn’t launch the specific version you selected or delete an older version.

CHI@Edge device enrollment fixes. CHI@Edge is the version of Chameleon that allows you to orchestrate applications on edge hardware, like Raspberry Pis and Jetson Nanos. If you have your own hardware, you can use the Edge SDK to bring your own device (BYOD) to CHI@Edge and either share it with the Chameleon community or limit it to your own project. This month, we’ve fixed issues that kept BYOD device enrollment from fully working.

Happy experimenting!

