Category – User Experiments

Power Patterns: Understanding the Energy Dynamics of I/O for Parallel Storage Configurations

Powering Through Data: Energy Insights for Parallel Storage Systems

Learn how cutting-edge research is shedding light on the energy dynamics of I/O operations in HPC environments, potentially reshaping future storage designs.

Towards Characterizing Genomics Workload Performance at Scale

Leveraging Chameleon's Bare Metal Resources to Benchmark Genomics Workflows

Martin Putra, a fourth-year PhD student at the University of Chicago, shares how he used Chameleon to build and test a scalable benchmarking tool for genomics workflows, uncovering insights that could lead to more efficient resource management for these computationally intensive tasks.

Rethinking Memory Management for Multi-Tiered Systems

Exploring Efficient Page Profiling and Migration in Large Heterogeneous Memory

Explore the cutting-edge research of Professor Dong Li from UC Merced as he tackles the challenges of managing multi-tiered memory systems. Learn how his innovative MTM (Multi-Tiered Memory Management) system optimizes page profiling and migration in large heterogeneous memory environments. Discover how Chameleon's unique hardware capabilities enabled this groundbreaking experiment, and gain insights into the future of high-performance computing memory management. This blog offers a glimpse into the complex world of computer memory hierarchies and how researchers are working to make them more efficient and accessible.

Real-time Scheduling for Time-Sensitive Networking: A Systematic Review and Experimental Study

Optimizing Network Performance with Chameleon's Computing Power

In this study, Chuanyu Xue tackles the complex challenge of optimizing Time-Sensitive Networking (TSN) for real-world applications. Using Chameleon's powerful computing resources, he conducts a comprehensive evaluation of 17 scheduling algorithms across 38,400 problem instances. This research not only sheds light on the strengths and weaknesses of various TSN scheduling methods but also demonstrates how large-scale experimentation can drive advancements in network optimization. Readers will gain insights from Xue's journey, including key findings, implementation challenges, and valuable tips for leveraging Chameleon in their own research.

Optimizing Production ML Inference for Accuracy and Cost Efficiency

Pushing the Boundaries of Cost-Effective ML Inference on Chameleon Testbed

In this blog post, we explore groundbreaking research on optimizing production ML inference systems to achieve high accuracy while minimizing costs. A collaboration between researchers from multiple institutions has produced three adaptive systems (InfAdapter, IPA, and Sponge) that tackle the accuracy-cost trade-off in complex, real-world ML scenarios. Learn how these solutions, implemented on the Chameleon testbed, are pushing the boundaries of cost-effective ML inference and enabling more accessible and scalable ML deployment.

Connecting SLICES-RI and Chameleon

An Approach towards Portable, Reproducible Experiments

"Connecting SLICES-RI and Chameleon: An Approach towards Portable, Reproducible Experiments" explores the creation of replicable scientific experiments through 'pos', a novel tool and methodology. Integrated with Chameleon and other public research infrastructure, 'pos' allows researchers to replicate experiments reliably across shared infrastructures. This collaboration aligns with the SLICES-RI initiative to enhance research portability. Authors Henning Stubbe, Sebastian Gallenmüller, and Georg Carle also share insights into maintaining adaptability and variety in long-term research projects, crucial for advancing experimental reproducibility and collaboration.

Baleen: ML Admission & Prefetching for Flash Caches

Future-Proofing Data Storage: The Role of ML in Smart Caching Solutions

In the era of exponential data growth, the "Baleen" project introduces a groundbreaking approach to flash caching, utilizing machine learning to optimize data storage. This method intelligently decides what to store and prefetch, significantly reducing the hardware required, lowering costs, and enhancing sustainability. This blog explores the challenges of managing vast data volumes and how "Baleen" offers a novel solution, poised to revolutionize data center operations and sustainability practices.

Metis Unleashed: A New Dawn for File System Integrity

File systems are a fundamental part of computer systems: they organize and protect the files and data on assorted devices, including computers, smartphones, and enterprise servers. Because of their crucial role, vulnerabilities and bugs in file systems can lead to severe consequences such as data loss and system crashes. After decades of development, file systems have become increasingly complex, yet bugs continue to emerge. Meanwhile, many new file systems are invented to support new hardware or features, often without undergoing comprehensive testing. To address these gaps, we develop a checking framework (Metis) that can thoroughly and efficiently test file …

How to Train a GPT From Scratch

An experiment reproducing NanoGPT and lessons learned

In this user experiment blog, Akash Kundu details his experience using Chameleon to train a GPT from scratch on a corpus of Shakespeare texts. He highlights the ease and impact of training generative pre-trained transformers (GPTs) on Chameleon Cloud, aiming to democratize access to this technology and underscore the importance of reproducible experiments.

Design Considerations and Analysis of Multi-Level Erasure Coding in Large-Scale Data Centers

Revolutionizing Data Storage with Multi-Level Erasure Coding

Delve into the intricate world of Multi-Level Erasure Coding (MLEC) and its application in large-scale data centers. Meng Wang, a Ph.D. candidate at the University of Chicago, presents comprehensive design considerations and analysis of MLEC, highlighting its advantages over traditional single-level erasure coding. The blog explores the significant impact of MLEC on data redundancy and storage efficiency.