Using edge devices for CPU-based inference
Machine learning models are most often trained in the “cloud”, on powerful centralized servers with specialized resources (like GPU acceleration).
However, for a variety of reasons including privacy, latency, and network connectivity or bandwidth constraints, it is often preferable to use these models (i.e. do inference) at “edge” devices located wherever the input data is generated and wherever the model’s prediction is going to be used.
Many edge devices are less powerful and typically lack any special acceleration, so the inference time (the time from when the input is fed to the model until the model outputs its prediction) may be slower than it would be on a cloud server - but we avoid having to send the input data to the cloud and then send the prediction back.
In this experiment, we will use an edge device for inference in an image classification context.
This notebook assumes you already have a "lease" available for a device on the CHI@Edge testbed. Then, it will show you how to do each of the following (a rough code sketch of these steps appears after the list):
- launch a "container" on that device
- attach an IP address to the container, so that you can access it over SSH
- transfer files to and from the container
- use a pre-trained image classification model to do inference on the edge device
- delete the container
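The notebook walks through these steps interactively. As a rough orientation, the sketch below shows what the container lifecycle might look like with python-chi's `container` and `lease` modules. The lease and container names, the image, the project name, and some helper names (e.g. `get_device_reservation`) are assumptions here, not the notebook's exact code - follow the notebook itself for the authoritative steps.

```python
# A rough sketch of the container lifecycle on CHI@Edge using python-chi.
# "my-lease", "my-container", the image, and the project name are placeholders.
import chi
from chi import container, lease

chi.use_site("CHI@Edge")
chi.set("project_name", "CHI-XXXXXX")   # placeholder: your own project

# look up the reservation from your existing device lease
my_lease = lease.get_lease("my-lease")
reservation_id = lease.get_device_reservation(my_lease["id"])

# launch a container on the reserved edge device
my_container = container.create_container(
    "my-container",
    image="python:3.9-slim",
    reservation_id=reservation_id,
)
container.wait_for_active(my_container.uuid)

# attach a public IP address so the container is reachable over SSH
public_ip = container.associate_floating_ip(my_container.uuid)
print(public_ip)

# transfer files to and from the container
container.upload(my_container.uuid, "./model", "/root/model")
container.download(my_container.uuid, "/root/result.txt", "./result.txt")

# when finished, delete the container
container.destroy_container(my_container.uuid)
```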
Consider running this together with "Using cloud servers for GPU-based inference"!
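Both experiments report inference time, so you can compare the CPU-only edge device against a GPU-accelerated cloud server. As a generic illustration of what such a measurement looks like, the sketch below times a pre-trained Keras MobileNetV2 classifier on a single image; the actual notebook may use a different model and test image, and "image.jpg" is a placeholder filename.

```python
# Generic sketch: time CPU inference for one image with a pre-trained
# ImageNet classifier (MobileNetV2). "image.jpg" is a placeholder.
import time
import numpy as np
from tensorflow.keras.applications.mobilenet_v2 import (
    MobileNetV2, preprocess_input, decode_predictions)
from tensorflow.keras.preprocessing import image

model = MobileNetV2(weights="imagenet")

# load and preprocess one input image
img = image.load_img("image.jpg", target_size=(224, 224))
x = preprocess_input(np.expand_dims(image.img_to_array(img), axis=0))

# measure inference time: from feeding the input to getting the prediction
start = time.time()
preds = model.predict(x)
elapsed = time.time() - start

print(f"Inference time: {elapsed:.3f} s")
for _, label, prob in decode_predictions(preds, top=3)[0]:
    print(f"{label}: {prob:.3f}")
```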
Materials are also available at: https://github.com/teaching-on-testbeds/edge-cpu-inference
Launching this artifact will open it within Chameleon’s shared Jupyter experiment environment, which is accessible to all Chameleon users with an active allocation.
Download archive: download an archive containing the files of this artifact.
Download with git
Clone the git repository for this artifact, and check out the version's commit:

```
git clone https://github.com/teaching-on-testbeds/edge-cpu-inference
cd edge-cpu-inference
git checkout 5ed8d9c4bdb6461dc58de62707a75a1081b01100
```
Submit feedback through GitHub issues