Category – User Experiments

Empowering the Edge: Breaking Heterogeneity Barriers in Cloud-based ML Training

Optimizing Federated Learning for Heterogeneous Edge Devices

Learn how researcher Redwan Khan uses Chameleon to develop FedCaSe, an innovative framework that tackles the challenges of distributed machine learning across diverse edge devices. This groundbreaking research demonstrates up to 29x improvement in client participation and 81x better data access efficiency, paving the way for more accessible and efficient AI systems.

If At First You Don't Succeed, Try, Try, Again...? Insights and LLM-Informed Tooling for Detecting Retry Bugs in Software Systems

Using Chameleon to Hunt Down Elusive Retry Bugs in Software Systems

Discover how Bogdan Stoica and researchers at the University of Chicago developed Wasabi, an innovative tool that combines fault injection, static analysis, and large language models to detect and analyze retry-related bugs in complex software systems. Learn how Chameleon's bare-metal capabilities enabled precise testing environments for this fascinating research published at SOSP'24.

Power Patterns: Understanding the Energy Dynamics of I/O for Parallel Storage Configurations

Powering Through Data: Energy Insights for Parallel Storage Systems

Learn how cutting-edge research is shedding light on the energy dynamics of I/O operations in HPC environments, potentially reshaping future storage designs.

Towards Characterizing Genomics Workload Performance at Scale

Leveraging Chameleon's Bare Metal Resources to Benchmark Genomics Workflows

Martin Putra, a fourth-year PhD student at the University of Chicago, shares how he used Chameleon to build and test a scalable benchmarking tool for genomics workflows, uncovering insights that could lead to more efficient resource management for these computationally intensive tasks.

Rethinking Memory Management for Multi-Tiered Systems

Exploring Efficient Page Profiling and Migration in Large Heterogeneous Memory

Explore the cutting-edge research of Professor Dong Li from UC Merced as he tackles the challenges of managing multi-tiered memory systems. Learn how his innovative MTM (Multi-Tiered Memory Management) system optimizes page profiling and migration in large heterogeneous memory environments. Discover how Chameleon's unique hardware capabilities enabled this groundbreaking experiment, and gain insights into the future of memory management for high-performance computing. This blog post offers a glimpse into the complex world of computer memory hierarchies and how researchers are working to make them more efficient and accessible.

Real-time Scheduling for Time-Sensitive Networking: A Systematic Review and Experimental Study

Optimizing Network Performance with Chameleon's Computing Power

In this study, Chuanyu Xue tackles the complex challenge of optimizing Time-Sensitive Networking (TSN) for real-world applications. Using Chameleon's powerful computing resources, he conducts a comprehensive evaluation of 17 scheduling algorithms across 38,400 problem instances. This research not only sheds light on the strengths and weaknesses of various TSN scheduling methods but also demonstrates how large-scale experimentation can drive advancements in network optimization. Readers will gain insights from Xue's journey, including key findings, implementation challenges, and valuable tips for leveraging Chameleon in their own research.

Optimizing Production ML Inference for Accuracy and Cost Efficiency

Pushing the Boundaries of Cost-Effective ML Inference on Chameleon Testbed

In this blog post, we explore groundbreaking research on optimizing production ML inference systems to achieve high accuracy while minimizing costs. A collaboration between researchers from multiple institutions has produced three adaptive systems (InfAdapter, IPA, and Sponge) that tackle the accuracy-cost trade-off in complex, real-world ML scenarios. Learn how these solutions, implemented on the Chameleon testbed, are pushing the boundaries of cost-effective ML inference and enabling more accessible and scalable ML deployment.

Connecting SLICES-RI and Chameleon

An Approach towards Portable, Reproducible Experiments

"Connecting SLICES-RI and Chameleon: An Approach towards Portable, Reproducible Experiments" explores the creation of replicable scientific experiments through 'pos', a novel tool and methodology. Integrated with Chameleon and other public research infrastructure, 'pos' allows researchers to replicate experiments reliably across shared infrastructures. This collaboration aligns with the SLICES-RI initiative to enhance research portability. Authors Henning Stubbe, Sebastian Gallenmüller, and Georg Carle also share insights into maintaining adaptability and variety in long-term research projects, crucial for advancing experimental reproducibility and collaboration.

Baleen: ML Admission & Prefetching for Flash Caches

Future-Proofing Data Storage: The Role of ML in Smart Caching Solutions

In the era of exponential data growth, the "Baleen" project introduces a groundbreaking approach to flash caching, utilizing machine learning to optimize data storage. This method intelligently decides what to store and prefetch, significantly reducing the hardware required, lowering costs, and enhancing sustainability. This blog explores the challenges of managing vast data volumes and how "Baleen" offers a novel solution, poised to revolutionize data center operations and sustainability practices.

Metis Unleashed: A New Dawn for File System Integrity

File systems are a fundamental part of computer systems: they organize and protect the files and data on assorted devices, including computers, smartphones, and enterprise servers. Because file systems play such a crucial role, their vulnerabilities and bugs can lead to severe consequences such as data loss and system crashes. After decades of development, file systems have become increasingly complex, yet bugs continue to emerge. Meanwhile, many new file systems are invented to support new hardware or features, often without undergoing comprehensive testing. To address these gaps, we develop a checking framework (Metis) that can thoroughly and efficiently test file …