Blog | Chameleon

Faster Multimodal AI, Lower GPU Costs

HiRED: Cutting Inference Costs for Vision-Language Models Through Intelligent Token Selection

April 29, 2025
Kazi Hasan Ibn Arif

User Experiments

High-resolution Vision-Language Models (VLMs) offer impressive accuracy but come with significant computational costs: processing thousands of tokens per image can consume 5 GB of GPU memory and add 15 seconds of latency. The HiRED (High-Resolution Early Dropping) framework addresses this challenge by selecting only the most informative visual tokens based on attention patterns. By keeping just 20% of tokens, researchers achieved a 4.7× throughput increase and a 78% latency reduction while maintaining accuracy across vision tasks. This research, conducted on Chameleon's infrastructure using RTX 6000 and A100 GPUs, demonstrates how thoughtful optimization can make advanced AI more accessible and affordable.
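To make the idea of attention-based token selection concrete, here is a minimal sketch of keeping a fixed fraction of tokens ranked by an attention score. It is a generic top-k illustration, not HiRED's actual algorithm: the function name `select_top_tokens`, the `keep_ratio` parameter, and the example scores are all hypothetical.

```python
from typing import List, Sequence


def select_top_tokens(attention_scores: Sequence[float],
                      keep_ratio: float = 0.2) -> List[int]:
    """Return the indices of the highest-scoring tokens, in original order.

    Hypothetical helper: a plain top-k selection under a token budget,
    assumed here only to illustrate the general idea.
    """
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")
    n = len(attention_scores)
    if n == 0:
        return []
    # Token budget: keep at least one token.
    budget = max(1, int(n * keep_ratio))
    # Rank token indices by score, highest first.
    ranked = sorted(range(n), key=lambda i: attention_scores[i], reverse=True)
    # Restore original order so positional structure is preserved.
    return sorted(ranked[:budget])


# Made-up attention scores for ten visual tokens.
scores = [0.02, 0.30, 0.01, 0.15, 0.05, 0.25, 0.03, 0.04, 0.10, 0.05]
kept = select_top_tokens(scores, keep_ratio=0.2)
print(kept)  # [1, 5]
```

With a 20% budget, only two of the ten tokens are passed on to later stages, which is where savings in memory and latency would come from in a scheme of this kind.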

EditLord: Learning Code Transformation Rules for Code Editing

Making code edits more effective, robust, and transparent through explicit transformation rules

March 24, 2025
Weichen Li

User Experiments

In this interview, Weichen Li, a PhD student from the University of Chicago, discusses research on improving code editing through explicit transformation rules. EditLord breaks down the code editing process into clear, step-by-step transformations, significantly enhancing editing performance, robustness, and functional correctness compared to existing methods.

Less Data, Better Results: How Active Learning Improves Workflow Anomaly Detection

Chameleon-Powered Research Shows the Path to Efficient Scientific Computing

Feb. 26, 2025

User Experiments

Scientific workflows often fail in unexpected ways, but traditional detection systems require massive amounts of training data. The active learning approach described in this post generates only the data needed to train anomaly detection models, improving accuracy while reducing resource consumption.

AutoAppendix: Towards One-Click Reproduction of Computational Artifacts

Streamlining Scientific Validation Through Automated Reproducibility Infrastructure

Jan. 27, 2025
Klaus Kraßnitzer

User Experiments

The AutoAppendix project evaluates computational artifact reproducibility across SC24 conference submissions, revealing that most researchers struggle with creating truly replicable experiments despite their importance to scientific validity. By developing one-click reproduction templates for the Chameleon Cloud platform, this research aims to transform how computational scientists share and validate their work, potentially saving countless hours of frustration for both authors and reviewers.

Minimizing Out-of-Memory Failures in Genomics Workflow Execution

Reducing Workflow Failures with Chameleon’s Scalable Research Platform

Dec. 30, 2024
Aaditya Mankar

User Experiments

Processing large-scale genomics data efficiently is a demanding task, often hindered by high costs and resource allocation challenges. This post describes a system designed to optimize genomics workflows by minimizing out-of-memory failures, a critical bottleneck in such operations. By combining scalable benchmarking tools with a failure-aware scheduler, the researchers improve resource efficiency and reliability. Built on experiments run on Chameleon, this work points toward more dependable genomic data processing.

Empowering the Edge: Breaking Heterogeneity Barriers in Cloud-based ML Training

Optimizing Federated Learning for Heterogeneous Edge Devices

Nov. 25, 2024
Redwan Ibne Seraj Khan

User Experiments

Learn how researcher Redwan Khan uses Chameleon to develop FedCaSe, an innovative framework that tackles the challenges of distributed machine learning across diverse edge devices. This groundbreaking research demonstrates up to 29× improvement in client participation and 81× better data access efficiency, paving the way for more accessible and efficient AI systems.

If At First You Don't Succeed, Try, Try, Again...? Insights and LLM-Informed Tooling for Detecting Retry Bugs in Software Systems

Using Chameleon to Hunt Down Elusive Retry Bugs in Software Systems

Oct. 21, 2024
Bogdan-Alexandru Stoica

User Experiments

Discover how Bogdan Stoica and researchers at the University of Chicago developed Wasabi, an innovative tool that combines fault injection, static analysis, and large language models to detect and analyze retry-related bugs in complex software systems. Learn how Chameleon's bare-metal capabilities enabled precise testing environments for this fascinating research published at SOSP'24.

Power Patterns: Understanding the Energy Dynamics of I/O for Parallel Storage Configurations

Powering Through Data: Energy Insights for Parallel Storage Systems

Sept. 30, 2024
Maya Purohit

User Experiments

Learn how cutting-edge research is shedding light on the energy dynamics of I/O operations in HPC environments, potentially reshaping future storage designs.

Towards Characterizing Genomics Workload Performance at Scale

Leveraging Chameleon's Bare Metal Resources to Benchmark Genomics Workflows

Aug. 26, 2024
Martin Putra

User Experiments

Martin Putra, a fourth-year PhD student at the University of Chicago, shares how he used Chameleon to build and test a scalable benchmarking tool for genomics workflows, uncovering insights that could lead to more efficient resource management for these computationally intensive tasks.

Rethinking Memory Management for Multi-Tiered Systems

Exploring Efficient Page Profiling and Migration in Large Heterogeneous Memory

July 22, 2024
Dong Li

User Experiments

Explore the research of Professor Dong Li from UC Merced as he tackles the challenges of managing multi-tiered memory systems. Learn how his MTM (Multi-Tiered Memory Management) system optimizes page profiling and migration in large heterogeneous memory environments, and how Chameleon's hardware capabilities enabled the experiment. This post offers a glimpse into the complex world of computer memory hierarchies and how researchers are working to make them more efficient and accessible.

Category – User Experiments