Category – User Experiments

Optimizing Production ML Inference for Accuracy and Cost Efficiency

Pushing the Boundaries of Cost-Effective ML Inference on Chameleon Testbed

This blog post explores research on optimizing production ML inference systems to achieve high accuracy while minimizing cost. A collaboration between researchers from multiple institutions has produced three adaptive systems (InfAdapter, IPA, and Sponge) that tackle the accuracy-cost trade-off in complex, real-world ML scenarios. Learn how these solutions, implemented on the Chameleon testbed, make ML inference more cost-effective and ML deployment more accessible and scalable.

Connecting SLICES-RI and Chameleon

An Approach towards Portable, Reproducible Experiments

"Connecting SLICES-RI and Chameleon: An Approach towards Portable, Reproducible Experiments" explores the creation of replicable scientific experiments through 'pos', a novel tool and methodology. Integrated with Chameleon and other public research infrastructure, 'pos' allows researchers to replicate experiments reliably across shared infrastructures. This collaboration aligns with the SLICES-RI initiative to enhance research portability. Authors Henning Stubbe, Sebastian Gallenmüller, and Georg Carle also share insights into maintaining adaptability and variety in long-term research projects, crucial for advancing experimental reproducibility and collaboration.

Baleen: ML Admission & Prefetching for Flash Caches

Future-Proofing Data Storage: The Role of ML in Smart Caching Solutions

In the era of exponential data growth, the "Baleen" project introduces a groundbreaking approach to flash caching, utilizing machine learning to optimize data storage. This method intelligently decides what to store and prefetch, significantly reducing the hardware required, lowering costs, and enhancing sustainability. This blog explores the challenges of managing vast data volumes and how "Baleen" offers a novel solution, poised to revolutionize data center operations and sustainability practices.

Metis Unleashed: A New Dawn for File System Integrity

File systems are a fundamental part of computer systems: they organize and protect the files and data on assorted devices, including computers, smartphones, and enterprise servers. Because of this crucial role, vulnerabilities and bugs in a file system can lead to severe consequences such as data loss and system crashes. After decades of development, file systems have become increasingly complex, yet bugs continue to emerge. Meanwhile, many new file systems are invented to support new hardware or features, often without undergoing comprehensive testing. To address these gaps, we develop a checking framework (Metis) that can thoroughly and efficiently test file …

How to Train a GPT From Scratch

An experiment reproducing NanoGPT and lessons learned

In this user experiment blog, Akash Kundu details his experience using Chameleon to train a GPT from scratch on a corpus of Shakespeare texts. Using Chameleon Cloud, he demonstrates the ease and impact of training generative pre-trained transformers (GPTs), aiming to democratize access to this technology and underscore the importance of reproducible experiments.

Design Considerations and Analysis of Multi-Level Erasure Coding in Large-Scale Data Centers

Revolutionizing Data Storage with Multi-Level Erasure Coding

This post explores Multi-Level Erasure Coding (MLEC) and its application in large-scale data centers. Meng Wang, a Ph.D. candidate at the University of Chicago, presents design considerations and an analysis of MLEC, highlighting its advantages over traditional single-level erasure coding and its impact on data redundancy and storage efficiency.

High School Summer Research Students at NYU Investigate Cloud and Edge Inference on Chameleon

Exploring Cloud and Edge Inference: High School Students' Journey Through Machine Learning Research with Chameleon at NYU

This blog post outlines the experience of high school students in a summer research program at NYU, focusing on cloud and edge machine learning inference projects that use the Chameleon platform and associated Trovi artifacts. The authors detail their practical exploration of machine learning in the cloud and at the edge, review results, and discuss the technical challenges they encountered and the solutions they developed.

Chameleon at SC '23

We would like to congratulate Alicia Esquivel Morel and the team on the acceptance of their paper, AutoLearn: Learning in the Edge to Cloud Continuum, to the SC '23 conference, as well as two summer REU students who had posters accepted to SC.

Teaching from Edge to Cloud at the University of Missouri

This month, we're featuring an interview with Professor Prasad Calyam, a distinguished educator and researcher at the University of Missouri-Columbia. In the interview, he shares insights on effectively utilizing innovative tools like testbeds for teaching, offering valuable recommendations based on his own experiences.

OneDataShare: Democratizing Access to Data

We hope everybody had a lovely Juneteenth! Our User Experiment Blog is coming out slightly late this month due to the holiday, but good things come to those who wait ;-). In this month’s blog, we are talking with Jacob Goldverg, a student at the University at Buffalo who used Chameleon to investigate how to move large amounts of data efficiently while minimizing energy consumption.