Accelerate Your Research with NVIDIA H100 GPUs on KVM@TACC

Tips and tricks for making the most of Chameleon's new GPU resources and reservation-based workflow

June 20, 2025
Cody Hammock

The wait is over: NVIDIA H100 GPUs are now available on KVM@TACC! Whether you're training large language models, running complex simulations, or pushing the boundaries of scientific computing, our new reservation-based VMs for the H100s give you access to cutting-edge GPU acceleration in a flexible virtual environment. More flexibility with GPUs on KVM is coming soon, including multi-GPU support and fractional GPU access for smaller jobs.

Due to high demand and growing interest in these powerful accelerators across the testbed, we've focused on making our new H100s accessible to as many Chameleon users as possible. These nodes pack a lot of heat. Each node is a Dell PowerEdge XE9640 equipped with an Intel Xeon Platinum processor, 4 NVIDIA HGX H100 GPUs, 1 TB of DDR5-4800 RAM, two 447.13 GB PCIe NVMe drives, and one 3.84 TB PCIe NVMe drive. Each node has a single 25 GbE network connection. These GPUs excel at:

Large language model training and inference
High-resolution scientific simulations
Complex deep learning workloads
Accelerated data analytics and visualization

To maximize the impact of these powerful machines, we've prioritized (for now) a virtual approach over bare metal: virtualization lets multiple users share a node simultaneously, which broadens access for GPU-hungry workloads. Users must reserve a flavor (much as they would reserve a bare metal node) before launching a GPU-enabled VM, with leases lasting up to one week. Time-limited reservations further improve availability by timeboxing usage, so more users get access over a given period. Non-GPU-enabled VMs remain on-demand for the time being, with no reservation needed. Over the summer, however, we'll be extending the reservation model to the rest of KVM@TACC. This transition will help ensure fair access to all KVM resources as demand continues to grow.
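
If an experiment needs more than one week of GPU time, it can help to plan the lease windows up front. The following is a minimal sketch, not part of any Chameleon tooling: `plan_leases` is a hypothetical helper, the start date is arbitrary, and the date format is an assumption that should be checked against the reservation documentation. It also assumes back-to-back windows are available, which is not guaranteed on a shared testbed.

```python
from datetime import datetime, timedelta

MAX_LEASE = timedelta(days=7)   # one-week cap stated in this post
DATE_FMT = "%Y-%m-%d %H:%M"     # assumed format; verify against the reservation docs


def plan_leases(start, total):
    """Split `total` run time into consecutive windows of at most MAX_LEASE."""
    if total <= timedelta(0):
        raise ValueError("total duration must be positive")
    windows = []
    cursor = start
    remaining = total
    while remaining > timedelta(0):
        length = min(remaining, MAX_LEASE)
        windows.append((cursor, cursor + length))
        cursor += length
        remaining -= length
    return windows


if __name__ == "__main__":
    # Hypothetical 10-day experiment starting on an arbitrary date.
    for begin, end in plan_leases(datetime(2025, 7, 1, 9, 0), timedelta(days=10)):
        print(begin.strftime(DATE_FMT), "->", end.strftime(DATE_FMT))
```

Each window boundary is a natural checkpoint: save state to a volume or snapshot before one lease ends so the next can resume from it.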

While time-limited VMs represent a shift in workflow, KVM@TACC offers a number of features that make it well-suited for many common workflows (including CPU and GPU workloads) on Chameleon. Below, we dive into some of the tips and tricks for utilizing KVM and the new GPUs effectively.

Reproducibility and Infrastructure as Code

KVM virtual machines can be provisioned in minutes, making them ideal for rapid experimentation. Whether you're running benchmarks, testing new configurations, or deploying a training cluster, you can get started quickly without waiting for physical hardware.

By using automation tools like OpenStack Heat, Terraform, or Ansible, you can:

Rebuild environments from scratch between leases
Share templates with collaborators or students
Ensure results are repeatable across runs

Even if your final workload runs on bare metal, KVM is an efficient prototyping platform for developing and refining your setup first.
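
As one illustration of the infrastructure-as-code idea, the sketch below builds a minimal Heat (HOT) template as a Python dictionary and prints it as JSON. The function name `gpu_server_template` and every value passed to it (flavor, image, key pair, and network names) are placeholders, not real KVM@TACC identifiers; substitute the names from your own project and reservation. JSON is a subset of YAML, so the output should load wherever a YAML template is expected, but that is worth confirming against your Heat version.

```python
import json


def gpu_server_template(flavor, image, key_name, network):
    """Return a minimal Heat (HOT) template describing a single server."""
    return {
        "heat_template_version": "2018-08-31",
        "description": "Single GPU VM, rebuilt from scratch each lease",
        "resources": {
            "gpu_server": {
                "type": "OS::Nova::Server",
                "properties": {
                    "name": "h100-experiment",
                    "flavor": flavor,        # the flavor you reserved
                    "image": image,
                    "key_name": key_name,
                    "networks": [{"network": network}],
                },
            }
        },
    }


if __name__ == "__main__":
    # All four arguments are placeholders for illustration only.
    template = gpu_server_template(
        flavor="my-reserved-flavor",
        image="my-image",
        key_name="my-keypair",
        network="my-network",
    )
    print(json.dumps(template, indent=2))
```

Keeping the template under version control means the same environment can be recreated at the start of every lease and shared with collaborators.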

Persistent Storage and Snapshots

Even if your VM lasts only a week, your data doesn't have to. OpenStack Cinder volumes let you store data independently of your virtual machines. Volumes:

Persist across leases
Can be attached and detached as needed
Can be resized or snapshotted for backup and rollback

Snapshots of both VMs and volumes make it easy to pause and resume work across leases. If you need to shut down early or something goes wrong, you can pick up where you left off during your next reservation. This is especially valuable for teaching, debugging, and long-running research pipelines where reliability and repeatability are essential.
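
A simple end-of-lease routine might look like the sketch below. It only assembles `openstack` CLI command lines and prints them; nothing is executed. The helper `end_of_lease_commands` and the server and volume names are hypothetical, and the ordering (image the server, detach the volume, then snapshot it) is one reasonable choice rather than a prescribed Chameleon procedure.

```python
import shlex
from datetime import date


def end_of_lease_commands(server, volume, today):
    """Build CLI commands that preserve a VM and its data volume before a lease ends."""
    stamp = today.strftime("%Y%m%d")
    return [
        # Save the VM's disk as an image that a future lease can boot from.
        ["openstack", "server", "image", "create",
         "--name", f"{server}-{stamp}", server],
        # Detach the data volume so it is no longer in use.
        ["openstack", "server", "remove", "volume", server, volume],
        # Snapshot the detached volume for backup and rollback.
        ["openstack", "volume", "snapshot", "create",
         "--volume", volume, f"{volume}-{stamp}"],
    ]


if __name__ == "__main__":
    # Server and volume names are placeholders.
    for cmd in end_of_lease_commands("h100-experiment", "training-data", date.today()):
        print(shlex.join(cmd))
```

Printing the commands first makes it easy to review them before running anything against real resources.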

Flexible Networking

Another key resource that persists between reservations is your user-defined networking.

With OpenStack Neutron, you can:

Build custom virtual topologies
Use floating IPs and load balancers
Define security groups and firewall rules

Whether you're experimenting with cloud-native architectures, building testbeds for microservices, or teaching networking concepts, KVM's networking capabilities enable realistic, reusable environments—even within a short lease.
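
Because security groups outlive a lease, it is worth defining them carefully once. The sketch below validates a small set of ingress rules and renders the matching `openstack` CLI commands without running them. The `Rule` class, the group name, the ports, and the address ranges are all illustrative assumptions; the CIDR blocks shown come from ranges reserved for documentation.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    protocol: str
    port: int
    remote_cidr: str

    def validate(self):
        if self.protocol not in {"tcp", "udp"}:
            raise ValueError(f"unsupported protocol: {self.protocol}")
        if not 1 <= self.port <= 65535:
            raise ValueError(f"port out of range: {self.port}")
        # Raises ValueError on a malformed CIDR or one with host bits set.
        ipaddress.ip_network(self.remote_cidr)


def rule_commands(group, rules):
    """Validate every rule, then build the CLI commands that create the group."""
    for rule in rules:
        rule.validate()
    commands = [["openstack", "security", "group", "create", group]]
    for rule in rules:
        commands.append([
            "openstack", "security", "group", "rule", "create",
            "--protocol", rule.protocol,
            "--dst-port", str(rule.port),
            "--remote-ip", rule.remote_cidr,
            group,
        ])
    return commands


if __name__ == "__main__":
    # Placeholder rules: SSH and a notebook port, each limited to one example range.
    rules = [Rule("tcp", 22, "192.0.2.0/24"), Rule("tcp", 8888, "198.51.100.0/24")]
    for cmd in rule_commands("gpu-lab", rules):
        print(" ".join(cmd))
```

Validating all rules before emitting any command avoids creating a half-configured group when one entry has a typo.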

Ideal for Education

KVM@TACC is well-suited to classroom use. Instructors can provide students with isolated virtual machines, reproducible lab environments, and the freedom to explore and recover from mistakes. A one-week lease is typically more than sufficient for assignments or short-term projects, and snapshots or automated rebuilds make it easy to start over if needed.

This setup supports instruction across a range of subjects—from networking and systems to security and DevOps—while giving students hands-on experience with infrastructure tools they'll encounter in the field.

Getting Started with GPU-Enabled VMs

Ready to launch your first GPU-enabled virtual machine? We've made it easy to get started:

Follow our hands-on tutorial - Visit our Trovi sharing portal for an interactive walkthrough that covers:
- Making your first GPU reservation
- Launching a VM with GPU support
- Verifying GPU access and running sample workloads
- Best practices for managing your one-week lease
Dive into the documentation - For detailed technical information about the reservation system, check out the official Chameleon documentation on reservations.
Plan your workflow - Remember that leases last up to one week, so familiarize yourself with our persistence features (covered above) to ensure your work continues smoothly between reservations.
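
Once a GPU-enabled VM is running, a quick way to verify GPU access is to query `nvidia-smi` from inside it. The sketch below is a minimal example rather than the tutorial's own method: it assumes the NVIDIA driver and `nvidia-smi` are installed in the image, and the helper names `parse_gpu_lines` and `list_gpus` are hypothetical.

```python
import shutil
import subprocess

QUERY = ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"]


def parse_gpu_lines(text):
    """Parse 'name, memory' CSV lines as printed by the query above."""
    gpus = []
    for line in text.splitlines():
        if not line.strip():
            continue
        # Split on the last comma so a comma in the GPU name would not break parsing.
        name, memory = (field.strip() for field in line.rsplit(",", 1))
        gpus.append({"name": name, "memory_total": memory})
    return gpus


def list_gpus():
    """Return the GPUs visible to this machine, or an empty list if none are found."""
    if shutil.which("nvidia-smi") is None:
        return []
    try:
        result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    except subprocess.CalledProcessError:
        return []
    return parse_gpu_lines(result.stdout)


if __name__ == "__main__":
    gpus = list_gpus()
    if not gpus:
        print("No GPU visible; check the flavor, image, and driver installation.")
    for index, gpu in enumerate(gpus):
        print(f"GPU {index}: {gpu['name']} ({gpu['memory_total']})")
```

Running this check at the start of each lease catches a misconfigured image or flavor before any long job is launched.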

In short, the reservation-based model introduces a limited VM lifetime, but KVM@TACC continues to offer a fast, flexible, and resilient environment for research and education. Whether you're launching a one-hour benchmark or a week-long simulation, KVM provides the tools to build quickly, iterate confidently, and preserve what matters most.
