hila-chefer / RobustViT
[NeurIPS 2022] Official PyTorch implementation of Optimizing Relevance Maps of Vision Transformers Improves Robustness. This code lets you fine-tune the explainability maps of Vision Transformers to enhance robustness.
☆127 Updated 2 years ago
Alternatives and similar repositories for RobustViT:
Users who are interested in RobustViT are comparing it to the repositories listed below:
- [CVPR 2023] Learning Visual Representations via Language-Guided Sampling ☆146 Updated last year
- understanding model mistakes with human annotations ☆106 Updated 2 years ago
- ☆183 Updated last year
- Code release for "Improved baselines for vision-language pre-training" ☆60 Updated 10 months ago
- Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities ☆78 Updated 2 years ago
- MetaShift: A Dataset of Datasets for Evaluating Contextual Distribution Shifts and Training Conflicts (ICLR 2022) ☆109 Updated 2 years ago
- Visual Language Transformer Interpreter - An interactive visualization tool for interpreting vision-language transformers ☆88 Updated last year
- Official repository for "Revisiting Weakly Supervised Pre-Training of Visual Perception Models". https://arxiv.org/abs/2201.08371. ☆178 Updated 2 years ago
- ☆50 Updated 2 years ago
- Filtering, Distillation, and Hard Negatives for Vision-Language Pre-Training ☆135 Updated 2 years ago
- PyTorch code for MUST ☆106 Updated 2 years ago
- A PyTorch implementation of Mugs proposed by our paper "Mugs: A Multi-Granular Self-Supervised Learning Framework". ☆83 Updated last year
- Patching open-vocabulary models by interpolating weights ☆91 Updated last year
- This code provides a PyTorch implementation for OTTER (Optimal Transport distillation for Efficient zero-shot Recognition), as described … ☆68 Updated 3 years ago
- This is an official PyTorch/GPU implementation of SupMAE. ☆77 Updated 2 years ago
- Release of ImageNet-Captions ☆45 Updated 2 years ago
- A task-agnostic vision-language architecture as a step towards General Purpose Vision ☆92 Updated 3 years ago
- PixMix: Dreamlike Pictures Comprehensively Improve Safety Measures (CVPR 2022) ☆105 Updated 2 years ago
- Code and models for the paper "The effectiveness of MAE pre-pretraining for billion-scale pretraining" https://arxiv.org/abs/2303.13496 ☆87 Updated 7 months ago
- Implementation of MaMMUT, a simple vision-encoder text-decoder architecture for multimodal tasks from Google, in Pytorch ☆100 Updated last year
- Reproducible scaling laws for contrastive language-image learning (https://arxiv.org/abs/2212.07143) ☆157 Updated last year
- VICRegL official code base ☆226 Updated 2 years ago
- Natural Language Descriptions of Deep Visual Features, ICLR 2022 ☆62 Updated last year
- Repository providing a wide range of self-supervised pretrained models for computer vision tasks. ☆62 Updated 3 years ago
- JAX implementation ViT-VQGAN ☆82 Updated 2 years ago
- Command-line tool for downloading and extending the RedCaps dataset. ☆46 Updated last year
- Generate text captions for images from their embeddings. ☆104 Updated last year
- CLIP Itself is a Strong Fine-tuner: Achieving 85.7% and 88.0% Top-1 Accuracy with ViT-B and ViT-L on ImageNet ☆212 Updated 2 years ago
- Code for the paper titled "CiT Curation in Training for Effective Vision-Language Data". ☆78 Updated 2 years ago
- Pytorch implementation of LOST unsupervised object discovery method ☆242 Updated last year