ariG23498 / TokenLearner
TensorFlow implementation of "TokenLearner: What Can 8 Learned Tokens Do for Images and Videos?"
☆33Updated 3 years ago
Alternatives and similar repositories for TokenLearner:
Users who are interested in TokenLearner are comparing it to the libraries listed below
- This is an official PyTorch/GPU implementation of SupMAE.☆77Updated 2 years ago
- [ICME 2022] code for the paper, SimVit: Exploring a simple vision transformer with sliding windows.☆67Updated 2 years ago
- Official codes for ConMIM (ICLR 2023)☆58Updated 2 years ago
- ☆52Updated 2 years ago
- ☆16Updated last year
- HIRL: A General Framework for Hierarchical Image Representation Learning (http://arxiv.org/abs/2205.13159)☆40Updated 2 years ago
- GroupViT: Semantic Segmentation Emerges from Text Supervision☆25Updated 2 years ago
- ☆58Updated 2 years ago
- i-mae Pytorch Repo☆20Updated 11 months ago
- [ECCV2024][ICCV2023] Official PyTorch implementation of SeiT++ and SeiT☆55Updated 7 months ago
- [NeurIPS 2021] ORL: Unsupervised Object-Level Representation Learning from Scene Images☆58Updated 3 years ago
- ☆26Updated 3 years ago
- Clipora is a powerful toolkit for fine-tuning OpenCLIP models using Low Rank Adapters (LoRA).☆21Updated 7 months ago
- ☆50Updated 2 years ago
- ☆64Updated last year
- FastMIM, official pytorch implementation of our paper "FastMIM: Expediting Masked Image Modeling Pre-training for Vision"(https://arxiv.o…☆39Updated 2 years ago
- Official Code of ECCV 2022 paper MS-CLIP☆88Updated 2 years ago
- [ICLR2024] Exploring Target Representations for Masked Autoencoders☆53Updated last year
- PyTorch code for MUST☆106Updated 2 years ago
- PyTorch Implementation of "Your ViT is Secretly a Hybrid Discriminative-Generative Diffusion Model"☆48Updated 2 years ago
- Code and Models for "GeneCIS: A Benchmark for General Conditional Image Similarity"☆56Updated last year
- Code of CropMix: Sampling a Rich Input Distribution via Multi-Scale Cropping☆17Updated 2 years ago
- ☆30Updated 2 years ago
- Implementation of Cross Transformer for spatially-aware few-shot transfer, in Pytorch☆52Updated 3 years ago
- ☆8Updated 2 years ago
- code release of research paper "Exploring Long-Sequence Masked Autoencoders"☆99Updated 2 years ago
- code base for vision transformers☆36Updated 3 years ago
- We present a framework for training multi-modal deep learning models on unlabelled video data by forcing the network to learn invariances…☆47Updated 3 years ago
- [ICLR 23] Contrastive Alignment of Vision to Language Through Parameter-Efficient Transfer Learning☆38Updated last year
- [ICCV23] Official implementation of eP-ALM: Efficient Perceptual Augmentation of Language Models.☆27Updated last year