Code release for "Improved baselines for vision-language pre-training"
☆62May 6, 2024Updated 2 years ago
Alternatives and similar repositories for clip-rocket
Users who are interested in clip-rocket are comparing it to the libraries listed below.
- This is the implementation of CounterCurate, the data curation pipeline of both physical and semantic counterfactual image-caption pairs.☆19Jun 27, 2024Updated last year
- In this codebase we establish a benchmark for egocentric user adaptation based on Ego4d. First, we start from a population model which ha…☆15May 18, 2026Updated 3 weeks ago
- Project for SNARE benchmark☆11Jun 5, 2024Updated 2 years ago
- FFCV-SSL Fast Forward Computer Vision for Self-Supervised Learning.☆216Aug 1, 2023Updated 2 years ago
- [ICCV 2023] Going Beyond Nouns With Vision & Language Models Using Synthetic Data☆13Sep 30, 2023Updated 2 years ago
- Library for the Test-based Calibration Error (TCE) metric to quantify the degree of classifier calibration.☆13Sep 15, 2023Updated 2 years ago
- Code for “Pretrained Language Models as Visual Planners for Human Assistance”☆63Jun 12, 2023Updated 3 years ago
- An essential implementation of BYOL in PyTorch + PyTorch Lightning☆51Jul 15, 2021Updated 4 years ago
- Experiments for "A Closer Look at In-Context Learning under Distribution Shifts"☆18May 29, 2023Updated 3 years ago
- Generalizing from SIMPLE to HARD Visual Reasoning: Can We Mitigate Modality Imbalance in VLMs?☆19Jun 3, 2025Updated last year
- Generative model for 3D high-resolution cardiac segmentation☆13Feb 25, 2022Updated 4 years ago
- Code and datasets for "Text encoders are performance bottlenecks in contrastive vision-language models". Coming soon!☆11May 24, 2023Updated 3 years ago
- ☆10Jul 5, 2024Updated last year
- ☆61Jun 16, 2023Updated 3 years ago
- Repository for the paper "Dense and Aligned Captions (DAC) Promote Compositional Reasoning in VL Models"☆28Nov 29, 2023Updated 2 years ago
- Code and data setup for the paper "Are Diffusion Models Vision-and-language Reasoners?"☆33Mar 15, 2024Updated 2 years ago
- VisualGPTScore for visio-linguistic reasoning☆27Oct 7, 2023Updated 2 years ago
- ☆30Jul 23, 2025Updated 10 months ago
- Code for paper "Point and Ask: Incorporating Pointing into Visual Question Answering"☆19Oct 4, 2022Updated 3 years ago
- This is the repository for "SELECT: A Large-Scale Benchmark of Data Curation Strategies for Image Recognition"☆16Oct 8, 2024Updated last year
- ☆20Apr 23, 2024Updated 2 years ago
- Official Implementation of "CLEFT: Language-Image Contrastive Learning with Efficient Large Language Model and Prompt Fine-Tuning" on MIC…☆18Feb 12, 2025Updated last year
- [WACV 2024] Enhancing Multimodal Compositional Reasoning of Visual Language Models with Generative Negative Mining☆13Jan 3, 2024Updated 2 years ago
- The official PyTorch code for "Continual Attentive Fusion for Incremental Learning in Semantic Segmentation"☆16Apr 8, 2022Updated 4 years ago
- If CLIP Could Talk: Understanding Vision-Language Model Representations Through Their Preferred Concept Descriptions☆17Apr 4, 2024Updated 2 years ago
- [TACL/EMNLP'24] Do Vision and Language Models Share Concepts? A Vector Space Alignment Study☆16Nov 22, 2024Updated last year
- Official Repository for "GOTCHA: Real-Time Video Deepfake Detection via Challenge-Response"☆11Jul 8, 2024Updated last year
- Adversarial Robustness on In- and Out-Distribution Improves Explainability☆12Feb 10, 2022Updated 4 years ago
- SIEVE: Multimodal Dataset Pruning using Image-Captioning Models (CVPR 2024)☆21Apr 28, 2024Updated 2 years ago
- Code release for "Semi-supervised learning made simple with self-supervised clustering"☆63Jun 13, 2023Updated 3 years ago
- This is the repository for TimelineQA, a benchmark for querying lifelogs.☆26Jul 5, 2023Updated 2 years ago
- PyTorch code for "Improving Self-Supervised Learning by Characterizing Idealized Representations"☆41Nov 27, 2022Updated 3 years ago
- ☆37Oct 7, 2023Updated 2 years ago
- Generalization in Metric Learning: Should the Embedding Layer be the Embedding Layer?☆11Jan 3, 2019Updated 7 years ago
- Official code for the paper "Why Do Self-Supervised Models Transfer? Investigating the Impact of Invariance on Downstream Tasks".☆16Dec 7, 2021Updated 4 years ago
- Code for T-MARS data filtering☆35Aug 23, 2023Updated 2 years ago
- Official repo for the paper "EXIF as Language: Learning Cross-Modal Associations Between Images and Camera Metadata"☆53Nov 3, 2023Updated 2 years ago
- Code for ICCV2021: Discovering Human Interactions with Large-Vocabulary Objects via Query and Multi-Scale Detection☆28Oct 12, 2021Updated 4 years ago
- Visual and Embodied Concepts evaluation benchmark☆21Oct 10, 2023Updated 2 years ago