Code release for "Improved baselines for vision-language pre-training"
☆62May 6, 2024Updated last year
Alternatives and similar repositories for clip-rocket
Users that are interested in clip-rocket are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- In this codebase we establish a benchmark for egocentric user adaptation based on Ego4d.First, we start from a population model which ha…☆15Jan 16, 2025Updated last year
- FFCV-SSL Fast Forward Computer Vision for Self-Supervised Learning.☆213Aug 1, 2023Updated 2 years ago
- Library for the Test-based Calibration Error (TCE) metric to quantify the degree to classifier calibration.☆13Sep 15, 2023Updated 2 years ago
- Removing Cost Volumes from Optical Flow Estimators (ICCV 2025 Oral)☆36Dec 2, 2025Updated 3 months ago
- Code for “Pretrained Language Models as Visual Planners for Human Assistance”☆62Jun 12, 2023Updated 2 years ago
- NordVPN Special Discount Offer • AdSave on top-rated NordVPN 1 or 2-year plans with secure browsing, privacy protection, and support for for all major platforms.
- Experiments for "A Closer Look at In-Context Learning under Distribution Shifts"☆19May 29, 2023Updated 2 years ago
- Generalizing from SIMPLE to HARD Visual Reasoning: Can We Mitigate Modality Imbalance in VLMs?☆18Jun 3, 2025Updated 9 months ago
- Generative model for 3D high-resolution cardiac segmentation☆13Feb 25, 2022Updated 4 years ago
- Code and datasets for "Text encoders are performance bottlenecks in contrastive vision-language models". Coming soon!☆11May 24, 2023Updated 2 years ago
- ☆62Jun 16, 2023Updated 2 years ago
- Repository for the paper: dense and aligned captions (dac) promote compositional reasoning in vl models☆27Nov 29, 2023Updated 2 years ago
- Code and data setup for the paper "Are Diffusion Models Vision-and-language Reasoners?"☆33Mar 15, 2024Updated 2 years ago
- CLIP-Mamba: CLIP Pretrained Mamba Models with OOD and Hessian Evaluation☆80Aug 15, 2024Updated last year
- VisualGPTScore for visio-linguistic reasoning☆27Oct 7, 2023Updated 2 years ago
- Managed Database hosting by DigitalOcean • AdPostgreSQL, MySQL, MongoDB, Kafka, Valkey, and OpenSearch available. Automatically scale up storage and focus on building your apps.
- This the code for the paper "On Perceptual Lossy Compression: The Cost of Perceptual Reconstruction and An Optimal Training Framework, IC…☆12Jul 6, 2021Updated 4 years ago
- Implementation of DropCov as described in DropCov: A Simple yet Effective Method for Improving Deep Architectures☆10Oct 15, 2022Updated 3 years ago
- ☆29Jul 23, 2025Updated 8 months ago
- Code for paper "Point and Ask: Incorporating Pointing into Visual Question Answering"☆19Oct 4, 2022Updated 3 years ago
- This is the repository for "SELECT: A Large-Scale Benchmark of Data Curation Strategies for Image Recognition"☆16Oct 8, 2024Updated last year
- ☆37Oct 13, 2021Updated 4 years ago
- ☆20Apr 23, 2024Updated last year
- Official Implementation of "CLEFT: Language-Image Contrastive Learning with Efficient Large Language Model and Prompt Fine-Tuning" on MIC…☆18Feb 12, 2025Updated last year
- Enhancing Multimodal Compositional Reasoning of Visual Language Models with Generative Negative Mining, WACV 2024☆13Jan 3, 2024Updated 2 years ago
- End-to-end encrypted email - Proton Mail • AdSpecial offer: 40% Off Yearly / 80% Off First Month. All Proton services are open source and independently audited for security.
- Code for ICCV 2023 paper ✨ "StylerDALLE: Language-Guided Style Transfer Using a Vector-Quantized Tokenizer of a Large-Scale Generative Mo…☆18Jan 25, 2024Updated 2 years ago
- The offical Pytorch code for "Continual Attentive Fusion for Incremental Learning in Semantic Segmentation"☆16Apr 8, 2022Updated 3 years ago
- If CLIP Could Talk: Understanding Vision-Language Model Representations Through Their Preferred Concept Descriptions☆17Apr 4, 2024Updated last year
- [TACL/EMNLP'24] Do Vision and Language Models Share Concepts? A Vector Space Alignment Study☆16Nov 22, 2024Updated last year
- A Pytorch implemtentation of ICCV 2019 paper Face Swapping Gan (https://arxiv.org/abs/1908.05932)☆21Nov 11, 2019Updated 6 years ago
- Official Repository for "GOTCHA: Real-Time Video Deepfake Detection via Challenge-Response"☆11Jul 8, 2024Updated last year
- PyTorch implementation of "Bootstrap Your Own Latent: A New Approach to Self-Supervised Learning" with DDP and Apex AMP☆84Sep 16, 2020Updated 5 years ago
- The code release for "Variational Structured Attention Networks for Visual Dense Representation Learning"☆14Nov 28, 2022Updated 3 years ago
- ☆12Sep 6, 2023Updated 2 years ago
- DigitalOcean Gradient AI Platform • AdBuild production-ready AI agents using customizable tools or access multiple LLMs through a single endpoint. Create custom knowledge bases or connect external data.
- Code and model definition for converting the official BYOL weights to PyTorch☆33Nov 23, 2021Updated 4 years ago
- Adversarial Robustness on In- and Out-Distribution Improves Explainability☆12Feb 10, 2022Updated 4 years ago
- ☆17Dec 13, 2023Updated 2 years ago
- [EMNLP 2024] Official code for "Beyond Embeddings: The Promise of Visual Table in Multi-Modal Models"☆20Oct 17, 2024Updated last year
- 📍 Official repository of paper "ProtoCLIP: Prototypical Contrastive Language Image Pretraining" (IEEE TNNLS 2023)☆56Nov 8, 2023Updated 2 years ago
- Code release for "Semi-supervised learning made simple with self-supervised clustering"☆60Jun 13, 2023Updated 2 years ago
- ☆37Oct 7, 2023Updated 2 years ago