SforAiDl / vformer
A modular PyTorch library for vision transformer models
☆163 Updated last year
Related projects
Alternatives and complementary repositories for vformer
- Pytorch implementation of LOST unsupervised object discovery method ☆237 Updated last year
- Masked Siamese Networks for Label-Efficient Learning (https://arxiv.org/abs/2204.07141) ☆449 Updated 2 years ago
- Probing the representations of Vision Transformers. ☆316 Updated 2 years ago
- Implementation of STAM (Space Time Attention Model), a pure and simple attention model that reaches SOTA for video classification ☆130 Updated 3 years ago
- Implementation of Uniformer, a simple attention and 3d convolutional net that achieved SOTA in a number of video classification tasks, de… ☆97 Updated 2 years ago
- EsViT: Efficient self-supervised Vision Transformers ☆408 Updated last year
- Code to reproduce the results in the FAIR research papers "Semi-Supervised Learning of Visual Features by Non-Parametrically Predicting V… ☆487 Updated last year
- VICRegL official code base ☆224 Updated last year
- ☆178 Updated last year
- [NeurIPS 2022] Official PyTorch implementation of Optimizing Relevance Maps of Vision Transformers Improves Robustness. This code allows … ☆127 Updated last year
- [CVPR 2022] Deep Spectral Methods: A Surprisingly Strong Baseline for Unsupervised Semantic Segmentation and Localization ☆230 Updated last year
- Implementation of popular SOTA self-supervised learning algorithms as Fastai Callbacks. ☆318 Updated last year
- (ICCV 2021 Oral) CoaT: Co-Scale Conv-Attentional Image Transformers ☆228 Updated 2 years ago
- NeurIPS 2021, Official codes for "Efficient Training of Visual Transformers with Small Datasets". ☆139 Updated last year
- Self-Supervised Learning in PyTorch ☆130 Updated 8 months ago
- This repository contains an overview of important follow-up works based on the original Vision Transformer (ViT) by Google. ☆149 Updated 2 years ago
- Code Release for MeMViT Memory-Augmented Multiscale Vision Transformer for Efficient Long-Term Video Recognition, CVPR 2022 ☆145 Updated last year
- Easiest way of fine-tuning HuggingFace video classification models ☆134 Updated last year
- Official PyTorch implementation of Fully Attentional Networks ☆467 Updated last year
- [ICML 2023] Official PyTorch implementation of Global Context Vision Transformers ☆425 Updated 10 months ago
- Fine-tune Facebook's DETR (DEtection TRansformer) on Colaboratory. ☆139 Updated last year
- A summarization of Transformer-based architectures for CV tasks, including image classification, object detection, segmentation, and Few-… ☆106 Updated 2 years ago
- The repository collects many various multi-modal transformer architectures, including image transformer, video transformer, image-languag… ☆219 Updated 2 years ago
- FFCV-SSL Fast Forward Computer Vision for Self-Supervised Learning. ☆203 Updated last year
- TF2 implementation of knowledge distillation using the "function matching" hypothesis from https://arxiv.org/abs/2106.05237. ☆87 Updated 3 years ago
- Implementation of Online Label Smoothing in PyTorch ☆94 Updated 2 years ago
- (CVPR 2022) Pytorch implementation of "Self-supervised transformers for unsupervised object discovery using normalized cut" ☆300 Updated last year
- A simple wrapper library for binding timm models as detectron2 backbones ☆38 Updated last year
- MultiMAE: Multi-modal Multi-task Masked Autoencoders, ECCV 2022 ☆548 Updated last year
- Documentation for Ross Wightman's timm image model library ☆292 Updated 7 months ago