[CVPR 2025 Highlight] The official CLIP training codebase of Inf-CL: "Breaking the Memory Barrier: Near Infinite Batch Size Scaling for Contrastive Loss". A highly memory-efficient CLIP training scheme.
☆285 · Jan 16, 2025 · Updated last year
Alternatives and similar repositories for Inf-CLIP
Users that are interested in Inf-CLIP are comparing it to the libraries listed below.
- ✨✨The Curse of Multi-Modalities (CMM): Evaluating Hallucinations of Large Multimodal Models across Language, Visual, and Audio ☆52 · Jul 11, 2025 · Updated 9 months ago
- [AAAI 2025] Open-vocabulary Video Instance Segmentation Codebase built upon Detectron2, which is really easy to use. ☆26 · Dec 30, 2024 · Updated last year
- [CVPR 2023] Out-of-Distributed Semantic Pruning for Robust Semi-Supervised Learning ☆22 · Jun 11, 2023 · Updated 2 years ago
- Fuzzy Positive Learning (CVPR2023) ☆15 · Jul 25, 2024 · Updated last year
- [CVPR 2023 Highlight & TPAMI] Video-Text as Game Players: Hierarchical Banzhaf Interaction for Cross-Modal Representation Learning ☆125 · Dec 28, 2024 · Updated last year
- [ICCV 2025 Highlight] The official repository for "2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining" ☆197 · Mar 17, 2025 · Updated last year
- [ICLR 2025] LongPO: Long Context Self-Evolution of Large Language Models through Short-to-Long Preference Optimization ☆44 · Feb 27, 2025 · Updated last year
- LLMBind: A Unified Modality-Task Integration Framework ☆19 · Jun 16, 2024 · Updated last year
- Run Effective Large Batch Contrastive Learning Beyond GPU/TPU Memory Constraint ☆433 · Mar 26, 2024 · Updated 2 years ago
- VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs ☆1,291 · Jan 23, 2025 · Updated last year
- [NeurIPS 2023] Act As You Wish: Fine-Grained Control of Motion Diffusion Model with Hierarchical Semantic Graphs ☆129 · Nov 15, 2023 · Updated 2 years ago
- Frontier Multimodal Foundation Models for Image and Video Understanding ☆1,136 · Aug 14, 2025 · Updated 8 months ago
- Precision Search through Multi-Style Inputs ☆74 · Jul 30, 2025 · Updated 8 months ago
- 【Nature Computational Science 2025🔥】Deep peak property learning for efficient chiral molecules ECD spectra prediction ☆50 · Jan 12, 2025 · Updated last year
- LLM2CLIP significantly improves already state-of-the-art CLIP models. ☆645 · Feb 1, 2026 · Updated 2 months ago
- [IJCAI 2023] Text-Video Retrieval with Disentangled Conceptualization and Set-to-Set Alignment ☆53 · Apr 9, 2024 · Updated 2 years ago
- The official repo for the DanQing dataset. ☆34 · Mar 25, 2026 · Updated 2 weeks ago
- The code for "TokenPacker: Efficient Visual Projector for Multimodal LLM", IJCV2025 ☆278 · May 26, 2025 · Updated 10 months ago
- [ECCV 2024] The official code for "AdaShield: Safeguarding Multimodal Large Language Models from Structure-based Attack via Adaptive Shi… ☆73 · Feb 9, 2026 · Updated 2 months ago
- [cvpr2023] implementation of out-of-candidate rectification methods ☆15 · Feb 28, 2023 · Updated 3 years ago
- [ICCV 2023] DiffusionRet: Generative Text-Video Retrieval with Diffusion Model ☆140 · Apr 9, 2024 · Updated 2 years ago
- GPT-4V(ision) as A Social Media Analysis Engine ☆39 · Dec 20, 2024 · Updated last year
- LLM Reasoning Benchmark & Chain-of-Thoughts Dataset for Chemistry ☆49 · Oct 9, 2025 · Updated 6 months ago
- [NeurIPS 2024] Stabilize the Latent Space for Image Autoregressive Modeling: A Unified Perspective ☆79 · Oct 31, 2024 · Updated last year
- Code for MetaMorph Multimodal Understanding and Generation via Instruction Tuning ☆236 · Jan 22, 2026 · Updated 2 months ago
- [CVPR 2025] 🔥 Official impl. of "TokenFlow: Unified Image Tokenizer for Multimodal Understanding and Generation". ☆454 · Aug 8, 2025 · Updated 8 months ago
- NeurIPS 2025 Spotlight; ICLR2024 Spotlight; CVPR 2024; EMNLP 2024 ☆1,832 · Nov 27, 2025 · Updated 4 months ago
- 【COLING 2025🔥】Code for the paper "Is Parameter Collision Hindering Continual Learning in LLMs?". ☆38 · Dec 5, 2024 · Updated last year
- V1: Toward Multimodal Reasoning by Designing Auxiliary Task ☆36 · Apr 14, 2025 · Updated last year
- [ICLR'25] PiCO: Peer Review in LLMs based on the Consistency Optimization, https://arxiv.org/pdf/2402.01830 ☆36 · Feb 16, 2025 · Updated last year
- [AAAI26] Next Patch Prediction ☆131 · Jan 2, 2025 · Updated last year
- Unified Multi-modal IAA Baseline and Benchmark ☆94 · Sep 27, 2024 · Updated last year
- [ECCV 2024] Official PyTorch implementation of DreamLIP: Language-Image Pre-training with Long Captions ☆138 · May 8, 2025 · Updated 11 months ago
- SEED-Voken: A Series of Powerful Visual Tokenizers ☆1,002 · Nov 25, 2025 · Updated 4 months ago
- [ICLR'25] Reconstructive Visual Instruction Tuning ☆134 · Apr 9, 2025 · Updated last year
- Code for "CAFe: Unifying Representation and Generation with Contrastive-Autoregressive Finetuning" ☆33 · Mar 26, 2025 · Updated last year
- The implementation of VectorNet. Done and Lose ☆41 · Jun 21, 2020 · Updated 5 years ago
- [CVPR 2024 Highlight] Mitigating Object Hallucinations in Large Vision-Language Models through Visual Contrastive Decoding ☆396 · Oct 7, 2024 · Updated last year
- [NeurIPS 2022 Spotlight] Expectation-Maximization Contrastive Learning for Compact Video-and-Language Representations ☆148 · Apr 9, 2024 · Updated 2 years ago