all-things-vits / code-samples
Holds code for our CVPR'23 tutorial: All Things ViTs: Understanding and Interpreting Attention in Vision.
☆184 · Updated last year
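The tutorial these samples accompany covers techniques for inspecting attention in Vision Transformers. As a minimal illustration of that theme (a sketch, not code from the repository itself), attention rollout (Abnar & Zuidema, 2020) aggregates per-layer attention maps into a single token-to-token relevance map:

```python
import numpy as np

def attention_rollout(attentions):
    """Aggregate per-layer ViT attention maps into one relevance map.

    attentions: list of (num_tokens, num_tokens) arrays, one per layer,
    each already averaged over heads and row-normalized (softmax output).
    """
    num_tokens = attentions[0].shape[0]
    rollout = np.eye(num_tokens)
    for attn in attentions:
        # Account for the residual connection by mixing in the identity,
        # then re-normalize so each row stays a probability distribution.
        attn = 0.5 * attn + 0.5 * np.eye(num_tokens)
        attn = attn / attn.sum(axis=-1, keepdims=True)
        rollout = attn @ rollout
    return rollout

# Toy example: 2 layers, 4 tokens (e.g. 1 CLS token + 3 patch tokens),
# uniform attention at every layer.
layers = [np.full((4, 4), 0.25) for _ in range(2)]
r = attention_rollout(layers)
print(np.allclose(r.sum(axis=-1), 1.0))  # rows still sum to 1
```

In practice the per-layer maps would come from a real ViT (e.g. hooks on the attention modules); the CLS-token row of the rollout is then reshaped into a patch-grid heatmap.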
Alternatives and similar repositories for code-samples:
Users interested in code-samples are comparing it to the repositories listed below.
- Open-source implementation of "Vision Transformers Need Registers" ☆168 · Updated last month
- [CVPR24] Official implementation of GEM (Grounding Everything Module) ☆113 · Updated 5 months ago
- Official implementation of the "CLIP-DINOiser: Teaching CLIP a few DINO tricks" paper ☆239 · Updated 4 months ago
- Official implementation of SCLIP: Rethinking Self-Attention for Dense Vision-Language Inference ☆151 · Updated 5 months ago
- Official PyTorch implementation of DiffuseMix: Label-Preserving Data Augmentation with Diffusion Models (CVPR 2024) ☆108 · Updated last week
- [ICCV 2023] CLIPN for Zero-Shot OOD Detection: Teaching CLIP to Say No ☆135 · Updated last year
- Official implementation of the CVPR 2024 paper "CLIP as RNN: Segment Countless Visual Concepts without Training Endeavor" ☆103 · Updated 9 months ago
- Augmenting with Language-guided Image Augmentation (ALIA) ☆75 · Updated last year
- Official implementation of the CrossMAE paper: Rethinking Patch Dependence for Masked Autoencoders ☆103 · Updated 3 months ago
- CLIP Surgery for Better Explainability with Enhancement in Open-Vocabulary Tasks ☆398 · Updated 3 weeks ago
- [AAAI'25, CVPRW 2024] Official repository of the paper "Learning to Prompt with Text Only Supervision for Vision-Language Models" ☆103 · Updated 3 months ago
- 1.5–3.0× lossless training or pre-training speedup. An off-the-shelf, easy-to-implement algorithm for the efficient training of foundatio… ☆219 · Updated 7 months ago
- Official implementation of "Interpreting CLIP's Image Representation via Text-Based Decomposition" ☆197 · Updated 3 months ago
- Official implementation and data release of the paper "Visual Prompting via Image Inpainting" ☆307 · Updated last year
- Learning from synthetic data: code and models ☆313 · Updated last year
- Dataset Diffusion: Diffusion-based Synthetic Data Generation for Pixel-Level Semantic Segmentation (NeurIPS 2023) ☆115 · Updated 6 months ago
- [NeurIPS'23] DropPos: Pre-Training Vision Transformers by Reconstructing Dropped Positions ☆61 · Updated 10 months ago
- Connecting segment-anything's output masks with the CLIP model; Awesome-Segment-Anything-Works ☆188 · Updated 5 months ago
- (ICLR 2023) Official PyTorch implementation of "What Do Self-Supervised Vision Transformers Learn?" ☆106 · Updated last year
- [ECCV'24] Official implementation of SemiVL: Semi-Supervised Semantic Segmentation with Vision-Language Guidance ☆126 · Updated 6 months ago
- Official code release for Denoising Vision Transformers ☆357 · Updated 4 months ago
- Official implementation of "Adapter is All You Need for Tuning Visual Tasks" ☆95 · Updated 2 weeks ago
- [CVPR 2023] Official repository of Generative Semantic Segmentation ☆211 · Updated last year
- 🤩 An AWESOME Curated List of Papers, Workshops, Datasets, and Challenges from CVPR 2024 ☆143 · Updated 9 months ago
- Text-Image Alignment for Diffusion-based Perception (TADP), CVPR 2024 ☆30 · Updated 6 months ago
- Code for the paper "Hyperbolic Image-Text Representations", Desai et al., ICML 2023 ☆153 · Updated last year
- CLIP Itself is a Strong Fine-tuner: Achieving 85.7% and 88.0% Top-1 Accuracy with ViT-B and ViT-L on ImageNet ☆213 · Updated 2 years ago
- The official repo for [TPAMI'23] "Vision Transformer with Quadrangle Attention" ☆199 · Updated 11 months ago