jacobmarks / awesome-clip-papers
The most impactful papers related to contrastive pretraining for multimodal models!
☆77 · Mar 5, 2024 · Updated last year
Alternatives and similar repositories for awesome-clip-papers
Users who are interested in awesome-clip-papers are comparing it to the repositories listed below.
- Remove exact and approximate duplicates from your dataset in FiftyOne! ☆18 · Apr 4, 2024 · Updated last year
- Convert datasets from Hugging Face to FiftyOne for Visualization ☆11 · Mar 15, 2024 · Updated last year
- Run optical character recognition with PyTesseract from the FiftyOne App! ☆11 · Apr 5, 2024 · Updated last year
- Albumentations Data Augmentation Plugin for FiftyOne! ☆14 · Aug 22, 2024 · Updated last year
- [NeurIPS 2024] WATT: Weight Average Test-Time Adaptation of CLIP ☆56 · Sep 26, 2024 · Updated last year
- [CVPR 23] Q: How to Specialize Large Vision-Language Models to Data-Scarce VQA Tasks? A: Self-Train on Unlabeled Images! ☆17 · May 14, 2024 · Updated last year
- Small Multimodal Vision Model "Imp-v1-3b" trained using Phi-2 and Siglip. ☆17 · Feb 5, 2024 · Updated 2 years ago
- ☆40 · Apr 8, 2024 · Updated last year
- Testbed for multimodal retrieval augmented generation techniques with FiftyOne, LlamaIndex, and Milvus ☆21 · Aug 9, 2024 · Updated last year
- A curated list of awesome prompt/adapter learning methods for vision-language models like CLIP. ☆749 · Dec 1, 2025 · Updated 2 months ago
- A simple CNN classifier example for PyTorch beginners. ☆17 · Mar 18, 2021 · Updated 4 years ago
- A benchmark for testing memorization abilities of LMs ☆22 · Oct 15, 2024 · Updated last year
- Code and benchmark for the paper: "A Practitioner's Guide to Continual Multimodal Pretraining" [NeurIPS'24] ☆61 · Dec 10, 2024 · Updated last year
- [ECCV2024] ClearCLIP: Decomposing CLIP Representations for Dense Vision-Language Inference ☆97 · Mar 26, 2025 · Updated 10 months ago
- Pytorch Implementation for CVPR 2024 paper: Learn to Rectify the Bias of CLIP for Unsupervised Semantic Segmentation ☆56 · Aug 27, 2025 · Updated 5 months ago
- My journey during 10 weeks of building FiftyOne plugins ☆22 · Nov 12, 2023 · Updated 2 years ago
- [NeurIPS '24] Frustratingly easy Test-Time Adaptation of VLMs!! ☆60 · Mar 24, 2025 · Updated 10 months ago
- [CBMI 2024 Best Paper] Official repository of the paper "Is CLIP the main roadblock for fine-grained open-world perception?". ☆32 · May 12, 2025 · Updated 9 months ago
- LLM2CLIP significantly improves already state-of-the-art CLIP models. ☆623 · Feb 1, 2026 · Updated last week
- [Pattern Recognition 25] CLIP Surgery for Better Explainability with Enhancement in Open-Vocabulary Tasks ☆462 · Mar 1, 2025 · Updated 11 months ago
- The official implementation of CMAE https://arxiv.org/abs/2207.13532 and https://ieeexplore.ieee.org/document/10330745 ☆115 · Jan 27, 2024 · Updated 2 years ago
- Code for "Can Retriever-Augmented Language Models Reason? The Blame Game Between the Retriever and the Language Model", EMNLP Findings 20… ☆28 · Nov 2, 2023 · Updated 2 years ago
- [ICLR 2025] Official code repository for "TULIP: Token-length Upgraded CLIP" ☆33 · Jan 26, 2026 · Updated 2 weeks ago
- Efficient Multimodal Large Language Models: A Survey ☆387 · Apr 29, 2025 · Updated 9 months ago
- [CVPR 2024] TeachCLIP for Text-to-Video Retrieval ☆42 · May 7, 2025 · Updated 9 months ago
- Our 2nd-gen LMM ☆34 · May 22, 2024 · Updated last year
- ☆30 · Sep 25, 2022 · Updated 3 years ago
- [CVPR23 Highlight] CREPE: Can Vision-Language Foundation Models Reason Compositionally? ☆35 · Apr 27, 2023 · Updated 2 years ago
- BenchX: A Unified Benchmark Framework for Medical Vision-Language Pretraining on Chest X-Rays ☆46 · Dec 27, 2025 · Updated last month
- ☆35 · Nov 25, 2025 · Updated 2 months ago
- The First to Know: How Token Distributions Reveal Hidden Knowledge in Large Vision-Language Models? ☆42 · Nov 1, 2024 · Updated last year
- When do we not need larger vision models? ☆412 · Feb 8, 2025 · Updated last year
- NeurIPS 2025 Spotlight; ICLR2024 Spotlight; CVPR 2024; EMNLP 2024 ☆1,812 · Nov 27, 2025 · Updated 2 months ago
- Run zero-shot prediction models on your data ☆36 · Dec 19, 2024 · Updated last year
- A curated list of papers and resources related to Described Object Detection, Open-Vocabulary/Open-World Object Detection and Referring E… ☆341 · Nov 6, 2025 · Updated 3 months ago
- A time delay estimation method for event-based time-series data. Time delay estimation is also known as the correction of time offsets an… ☆15 · Dec 3, 2025 · Updated 2 months ago
- This is an implementation code of paper "Integration of 3-Dimensional Discrete Wavelet Transform and Markov Random Field for Hyperspectra… ☆10 · Jan 13, 2020 · Updated 6 years ago
- Truncate datetime objects to the specified level of precision, inspired by PostgreSQL's DATE_TRUNC. ☆14 · Apr 20, 2021 · Updated 4 years ago
- [ACM MM '24 Poster] Official repository of paper titled "Towards Robustness Prompt Tuning with Fully Test-Time Adaptation for CLIP’s Zero… ☆10 · Aug 6, 2024 · Updated last year