Category – User Experiments

Baleen: ML Admission & Prefetching for Flash Caches

Future-Proofing Data Storage: The Role of ML in Smart Caching Solutions

As data volumes grow exponentially, the "Baleen" project applies machine learning to flash caching. It decides what to admit to the cache and what to prefetch, significantly reducing the hardware required, lowering costs, and improving sustainability. This blog explores the challenges of managing vast data volumes and how "Baleen" addresses them, with implications for data center operations and sustainability practices.

Metis Unleashed: A New Dawn for File System Integrity

File systems are a fundamental part of computer systems: they organize and protect the files and data on assorted devices, including computers, smartphones, and enterprise servers. Because of their crucial role, vulnerabilities and bugs in file systems can lead to severe consequences such as data loss and system crashes. After decades of development, file systems have become increasingly complex, yet bugs continue to emerge. Meanwhile, many new file systems are being created to support new hardware or features, often without undergoing comprehensive testing. To address these gaps, we develop a checking framework (Metis) that can thoroughly and efficiently test file …

How to Train a GPT From Scratch

An experiment reproducing NanoGPT and lessons learned

In this user experiment blog, Akash Kundu details his experience using Chameleon to train a GPT from scratch on a corpus of Shakespeare texts. He shows how easy and impactful training generative pre-trained transformers (GPTs) on Chameleon Cloud can be, aiming to democratize access to this technology and underscore the importance of reproducible experiments.

Design Considerations and Analysis of Multi-Level Erasure Coding in Large-Scale Data Centers

Revolutionizing Data Storage with Multi-Level Erasure Coding

Delve into the intricate world of Multi-Level Erasure Coding (MLEC) and its application in large-scale data centers. Meng Wang, a Ph.D. candidate at the University of Chicago, presents comprehensive design considerations and analysis of MLEC, highlighting its advantages over traditional single-level erasure coding. The blog explores the impact of MLEC on data redundancy and storage efficiency.

High School Summer Research Students at NYU Investigate Cloud and Edge Inference on Chameleon

Exploring Cloud and Edge Inference: High School Students' Journey Through Machine Learning Research with Chameleon at NYU

This blog post outlines the experience of high school students in a summer research program at NYU, where they worked on cloud and edge machine learning inference projects using the Chameleon platform and associated Trovi artifacts. The authors detail their practical exploration of machine learning in the cloud and at the edge, review results, and discuss the technical challenges they encountered and the solutions they developed.

Chameleon at SC '23

We would like to congratulate Alicia Esquivel Morel and the team on the acceptance of their paper, "AutoLearn: Learning in the Edge to Cloud Continuum", to the SC '23 conference. Congratulations also go to two summer REU students whose posters were accepted to SC.

Teaching from Edge to Cloud at the University of Missouri

This month, we're featuring an interview with Professor Prasad Calyam, a distinguished educator and researcher at the University of Missouri-Columbia.

In the interview, he shares insights on using innovative tools such as testbeds effectively for teaching and offers recommendations based on his own experience.

OneDataShare: Democratizing Access to Data

We hope everybody had a lovely Juneteenth! Our User Experiment Blog is coming out slightly late this month due to the holiday, but good things come to those who wait ;-). In this month’s blog we talk with Jacob Goldverg, a student at the University at Buffalo who used Chameleon to investigate how to move large amounts of data efficiently while minimizing energy consumption.

Storage Research Experiment Patterns on Chameleon Cloud and Trovi

Today, two UChicago students share their thoughts on how to create reproducible experiments in a cost-effective manner. Ray Sinurat and Yuyang (Roy) Huang talk about the experiment patterns they created for storage experiments and describe how these can serve as a basis for developing new ones. Best of all, they share the experiment patterns with the Chameleon community. We hope you will find them useful!