Re: An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
This project is part of the UCSC OSPO Summer of Reproducibility fellowship. It aims to create an interactive notebook for teaching undergraduate and graduate students the different levels of reproducibility in computer vision research.
The project is based on the paper "An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale" by Dosovitskiy et al., which applies the transformer architecture, originally designed for natural language processing, to image recognition. The paper shows that transformers can achieve state-of-the-art results on several image classification benchmarks, such as ImageNet, when pre-trained on large-scale datasets.
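The "16x16 words" in the paper's title refers to cutting an image into fixed-size patches and treating each patch as a token. A minimal sketch of that splitting step follows, in plain Python with no dependencies. The function name `image_to_patches` and the nested-list image layout (height x width x channels) are assumptions made for illustration and are not taken from the artifact's notebook. The sketch also omits the learned linear projection, position embeddings, and class token that the model applies afterwards.

```python
def image_to_patches(image, patch_size=16):
    """Split an image into flattened, non-overlapping square patches.

    `image` is a nested list indexed as image[row][col][channel]
    (an assumed layout, chosen only to keep the sketch dependency-free).
    Returns a list of patches in row-major order; each patch is a flat
    list of patch_size * patch_size * channels values.
    """
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image height and width must be multiples of patch_size")

    patches = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            # Flatten one patch: walk its rows, then its pixels, then channels.
            patch = [
                value
                for row in image[top:top + patch_size]
                for pixel in row[left:left + patch_size]
                for value in pixel
            ]
            patches.append(patch)
    return patches


# A 224x224 RGB image yields (224 / 16) ** 2 = 196 patches of 16 * 16 * 3 = 768 values.
blank = [[[0.0, 0.0, 0.0] for _ in range(224)] for _ in range(224)]
tokens = image_to_patches(blank)
print(len(tokens), len(tokens[0]))  # prints: 196 768
```

Each of the 196 flattened patches plays the role that a word embedding plays in a text transformer, which is why the sequence length depends on image resolution and patch size rather than on image content.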
Launching this artifact will open it within Chameleon’s shared Jupyter experiment environment, which is accessible to all Chameleon users with an active allocation.
Download Archive
Download an archive containing the files of this artifact.
Download with git
Clone the git repository for this artifact, then check out the commit for this version:
git clone https://github.com/mohammed183/re_vit
# cd into the created directory
cd re_vit
git checkout d17897de3ee0ca27790d9d3f6682c4a63ae6fcf7
Submit feedback through GitHub issues