[ACL 2025 Oral] 🔥🔥 MegaPairs: Massive Data Synthesis for Universal Multimodal Retrieval
☆247Nov 6, 2025Updated 7 months ago
Alternatives and similar repositories for MegaPairs
Users who are interested in MegaPairs are comparing it to the libraries listed below.
- LLaVE: Large Language and Vision Embedding Models with Hardness-Weighted Contrastive Learning☆78May 23, 2025Updated last year
- This repo contains the code for "VLM2Vec" [ICLR 2025], "VLM2Vec-V2 [TMLR 2026]", and "MMEB-V3"☆660Updated this week
- official code for "Modality Curation: Building Universal Embeddings for Advanced Multimodal Information Retrieval"☆42Jul 4, 2025Updated 11 months ago
- E5-V: Universal Embeddings with Multimodal Large Language Models☆275Dec 10, 2025Updated 6 months ago
- [CVPR 2025] LamRA: Large Multimodal Model as Your Advanced Retrieval Assistant☆182Jul 7, 2025Updated 11 months ago
- Not a neutral survey — a field manual for engineers who build, train, and ship multimodal retrieval at production scale. The C-L-I triang…☆79Apr 20, 2026Updated 2 months ago
- Reason-before-Retrieve: One-Stage Reflective Chain-of-Thoughts for Training-Free Zero-Shot Composed Image Retrieval [CVPR 2025 Highlight]☆71Jul 8, 2025Updated 11 months ago
- [AAAI 2026 Oral] The official code of "UniME-V2: MLLM-as-a-Judge for Universal Multimodal Embedding Learning"☆74Dec 8, 2025Updated 6 months ago
- ABC: Achieving Better Control of Multimodal Embeddings using VLMs [TMLR2025]☆21Aug 21, 2025Updated 10 months ago
- Comprehensive benchmark for video text understanding☆29Jun 4, 2025Updated last year
- Empowering RAG with a versatile model-driven data interface for all-purpose applications!☆17Sep 10, 2024Updated last year
- Official implementation of paper "Reason4Rec: Large Language Models for Recommendation with Deliberative User Preference Alignment"☆44Apr 10, 2025Updated last year
- Official code for paper "UniIR: Training and Benchmarking Universal Multimodal Information Retrievers" (ECCV 2024)☆183Oct 1, 2024Updated last year
- Code for "CAFe: Unifying Representation and Generation with Contrastive-Autoregressive Finetuning"☆33Mar 26, 2025Updated last year
- New generation of CLIP with strong fine grained discrimination capability, ICML2026 and ICML2025☆755Jun 16, 2026Updated 2 weeks ago
- Retrieval and Retrieval-augmented LLMs☆11,852Apr 22, 2026Updated 2 months ago
- [ICLR 2025 Spotlight] OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text☆424May 5, 2025Updated last year
- Collection of Composed Image Retrieval (CIR) papers.☆355Jun 8, 2026Updated 3 weeks ago
- Math24o: High School Olympiad Mathematics Chinese Benchmark☆13Mar 27, 2025Updated last year
- Retrieval and fine-grained reranking using Qwen3's Embedding and Reranker models☆23Jun 22, 2025Updated last year
- GPT4V-level open-source multi-modal model based on Llama3-8B☆2,437Mar 3, 2025Updated last year
- LLM2CLIP significantly improves already state-of-the-art CLIP models.☆671Feb 1, 2026Updated 4 months ago
- Solve Visual Understanding with Reinforced VLMs☆5,991Mar 12, 2026Updated 3 months ago
- 🔥🔥First-ever hour scale video understanding models☆626Jul 14, 2025Updated 11 months ago
- 🤗 HF Downloader (Hugging Face Downloader) 📦 A user-friendly GUI tool for downloading Hugging Face resources with enhanced connectivity…☆13Jan 5, 2025Updated last year
- mllm-npu: training multimodal large language models on Ascend NPUs☆95Aug 29, 2024Updated last year
- Official Pytorch implementation of LinCIR: Language-only Training of Zero-shot Composed Image Retrieval (CVPR 2024)☆147Jan 5, 2026Updated 5 months ago
- [NeurIPS 2023] HAP: Structure-Aware Masked Image Modeling for Human-Centric Perception☆44Mar 25, 2024Updated 2 years ago
- A novel Multimodal Large Language Model (MLLM) architecture, designed to structurally align visual and textual embeddings.☆1,454Feb 11, 2026Updated 4 months ago
- DenseFusion-1M: Merging Vision Experts for Comprehensive Multimodal Perception☆159Dec 6, 2024Updated last year
- Visual Delta Generator with Large Multi-modal Model for Semi-supervised Composed Image Retrieval - CVPR2024☆21May 30, 2024Updated 2 years ago
- A cli program of image retrieval using dinov2☆79Jul 8, 2023Updated 2 years ago
- AMES: Asymmetric and Memory-Efficient Similarity☆48Aug 12, 2025Updated 10 months ago