Official Code of ECCV 2022 paper MS-CLIP
☆91Jul 27, 2022Updated 3 years ago
Alternatives and similar repositories for MSCLIP
Users that are interested in MSCLIP are comparing it to the libraries listed below
Sorting:
- Pytorch code for Language Models with Image Descriptors are Strong Few-Shot Video-Language Learners☆116Sep 15, 2022Updated 3 years ago
- Locally Enhanced Self-Attention: Rethinking Self-Attention as Local and Context Terms☆11Nov 29, 2021Updated 4 years ago
- Repository for the paper "Data Efficient Masked Language Modeling for Vision and Language".☆18Sep 17, 2021Updated 4 years ago
- ☆43Jun 15, 2021Updated 4 years ago
- [ICLR 23] Contrastive Aligned of Vision to Language Through Parameter-Efficient Transfer Learning☆40Jul 29, 2023Updated 2 years ago
- Code and datasets for "Text encoders are performance bottlenecks in contrastive vision-language models". Coming soon!☆11May 24, 2023Updated 2 years ago
- [ICME 2022] code for the paper, SimVit: Exploring a simple vision transformer with sliding windows.☆68Oct 11, 2022Updated 3 years ago
- Code for "Recognizing Scenes from Novel Viewpoints"☆29Sep 16, 2022Updated 3 years ago
- An official PyTorch implementation for CLIPPR☆30Jul 22, 2023Updated 2 years ago
- Code release for SLIP Self-supervision meets Language-Image Pre-training☆787Feb 9, 2023Updated 3 years ago
- ☆15Aug 25, 2020Updated 5 years ago
- Supervision Exists Everywhere: A Data Efficient Contrastive Language-Image Pre-training Paradigm☆675Sep 19, 2022Updated 3 years ago
- [ICCV23] Official implementation of eP-ALM: Efficient Perceptual Augmentation of Language Models.☆27Oct 27, 2023Updated 2 years ago
- The SVO-Probes Dataset for Verb Understanding☆31Jan 28, 2022Updated 4 years ago
- Revisiting Contrastive Methods for Unsupervised Learning of Visual Representations. [NeurIPS 2021]☆89Oct 2, 2021Updated 4 years ago
- [CVPR 2022] Pseudo-Q: Generating Pseudo Language Queries for Visual Grounding☆153Jul 13, 2024Updated last year
- ☆181Aug 20, 2022Updated 3 years ago
- This code provides a PyTorch implementation for OTTER (Optimal Transport distillation for Efficient zero-shot Recognition), as described …☆71Dec 20, 2021Updated 4 years ago
- [CVPR2023] All in One: Exploring Unified Video-Language Pre-training☆281Mar 25, 2023Updated 2 years ago
- Align and Prompt: Video-and-Language Pre-training with Entity Prompts☆188May 1, 2025Updated 10 months ago
- SIEVE: Multimodal Dataset Pruning using Image-Captioning Models (CVPR 2024)☆18Apr 28, 2024Updated last year
- ECCV2022,Bootstrapped Masked Autoencoders for Vision BERT Pretraining☆97Nov 2, 2022Updated 3 years ago
- Turning to Video for Transcript Sorting☆49Aug 27, 2023Updated 2 years ago
- [ICLR 2022] "As-ViT: Auto-scaling Vision Transformers without Training" by Wuyang Chen, Wei Huang, Xianzhi Du, Xiaodan Song, Zhangyang Wa…☆76Feb 21, 2022Updated 4 years ago
- ☆33Jul 28, 2022Updated 3 years ago
- Research code for "Training Vision-Language Transformers from Captions Alone"☆33Jul 15, 2022Updated 3 years ago
- Data Release for VALUE Benchmark☆30Feb 16, 2022Updated 4 years ago
- Source code for EMNLP 2022 paper “PEVL: Position-enhanced Pre-training and Prompt Tuning for Vision-language Models”☆49Nov 10, 2022Updated 3 years ago
- [ICCV2023] Official code for "VL-PET: Vision-and-Language Parameter-Efficient Tuning via Granularity Control"☆53Sep 21, 2023Updated 2 years ago
- [CVPR 2022] Official code for "Unified Contrastive Learning in Image-Text-Label Space"☆407Nov 10, 2023Updated 2 years ago
- 📍 Official repository of paper "ProtoCLIP: Prototypical Contrastive Language Image Pretraining" (IEEE TNNLS 2023)☆55Nov 8, 2023Updated 2 years ago
- PyTorch code for "VL-Adapter: Parameter-Efficient Transfer Learning for Vision-and-Language Tasks" (CVPR2022)☆209Dec 18, 2022Updated 3 years ago
- [CVPR2023] The code for 《Position-guided Text Prompt for Vision-Language Pre-training》☆151Jun 7, 2023Updated 2 years ago
- Official Open Source code for "Scaling Language-Image Pre-training via Masking"☆427Mar 30, 2023Updated 2 years ago
- ☆22May 4, 2023Updated 2 years ago
- PyTorch code for Improving Commonsense in Vision-Language Models via Knowledge Graph Riddles (DANCE)☆23Nov 29, 2022Updated 3 years ago
- VaLM: Visually-augmented Language Modeling. ICLR 2023.☆56Mar 6, 2023Updated 2 years ago
- ☆55Feb 9, 2023Updated 3 years ago
- Official Code of IdealGPT☆35Oct 13, 2023Updated 2 years ago