[ACL 2025 Oral] 🔥🔥 MegaPairs: Massive Data Synthesis for Universal Multimodal Retrieval
☆246 · Nov 6, 2025 · Updated 6 months ago
Alternatives and similar repositories for MegaPairs
Users who are interested in MegaPairs are comparing it to the libraries listed below.
- This repo contains the code for "VLM2Vec: Training Vision-Language Models for Massive Multimodal Embedding Tasks" [ICLR 2025] ☆644 · Apr 28, 2026 · Updated 3 weeks ago
- Official code for "Modality Curation: Building Universal Embeddings for Advanced Multimodal Information Retrieval" ☆42 · Jul 4, 2025 · Updated 10 months ago
- E5-V: Universal Embeddings with Multimodal Large Language Models ☆275 · Dec 10, 2025 · Updated 5 months ago
- Evaluation code and datasets for the ACL 2024 paper, VISTA: Visualized Text Embedding for Universal Multi-Modal Retrieval. The original c… ☆48 · Nov 16, 2024 · Updated last year
- [CVPR 2025] LamRA: Large Multimodal Model as Your Advanced Retrieval Assistant ☆181 · Jul 7, 2025 · Updated 10 months ago
- ☆24 · Oct 16, 2025 · Updated 7 months ago
- [ACM MM 2025] The official code of "Breaking the Modality Barrier: Universal Embedding Learning with Multimodal LLMs" ☆105 · Dec 8, 2025 · Updated 5 months ago
- Not a neutral survey: a field manual for engineers who build, train, and ship multimodal retrieval at production scale. The C-L-I triang… ☆78 · Apr 20, 2026 · Updated last month
- Reason-before-Retrieve: One-Stage Reflective Chain-of-Thoughts for Training-Free Zero-Shot Composed Image Retrieval [CVPR 2025 Highlight] ☆70 · Jul 8, 2025 · Updated 10 months ago
- [AAAI 2026 Oral] The official code of "UniME-V2: MLLM-as-a-Judge for Universal Multimodal Embedding Learning" ☆72 · Dec 8, 2025 · Updated 5 months ago
- ABC: Achieving Better Control of Multimodal Embeddings using VLMs [TMLR 2025] ☆21 · Aug 21, 2025 · Updated 8 months ago
- Empowering RAG with a versatile model-driven data interface for all-purpose applications! ☆17 · Sep 10, 2024 · Updated last year
- [CVPR 2025] CoLLM: A Large Language Model for Composed Image Retrieval ☆29 · Mar 26, 2025 · Updated last year
- Official implementation of paper "Reason4Rec: Large Language Models for Recommendation with Deliberative User Preference Alignment" ☆44 · Apr 10, 2025 · Updated last year
- Official code for paper "UniIR: Training and Benchmarking Universal Multimodal Information Retrievers" (ECCV 2024) ☆179 · Oct 1, 2024 · Updated last year
- New generation of CLIP with strong fine-grained discrimination capability, ICML 2026 and ICML 2025 ☆754 · May 8, 2026 · Updated last week
- Retrieval and Retrieval-augmented LLMs ☆11,686 · Apr 22, 2026 · Updated 3 weeks ago
- ☆40 · Jan 12, 2026 · Updated 4 months ago
- [ICLR 2025 Spotlight] OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text ☆419 · May 5, 2025 · Updated last year
- Collection of Composed Image Retrieval (CIR) papers. ☆348 · May 8, 2026 · Updated last week
- ☆21 · Mar 5, 2025 · Updated last year
- Math24o: High School Olympiad Mathematics Chinese Benchmark (an evaluation set for high-school olympiad math competitions) ☆12 · Mar 27, 2025 · Updated last year
- ☆12 · Oct 3, 2023 · Updated 2 years ago
- Retrieval and fine-grained reranking implemented with Qwen3's Embedding and Reranker models ☆21 · Jun 22, 2025 · Updated 10 months ago
- GPT4V-level open-source multi-modal model based on Llama3-8B ☆2,440 · Mar 3, 2025 · Updated last year
- LLM2CLIP significantly improves already state-of-the-art CLIP models. ☆664 · Feb 1, 2026 · Updated 3 months ago
- Solve Visual Understanding with Reinforced VLMs ☆5,956 · Mar 12, 2026 · Updated 2 months ago
- 🔥🔥 First-ever hour-scale video understanding models ☆622 · Jul 14, 2025 · Updated 10 months ago
- 🤗 HF Downloader (Hugging Face Downloader) 📦 A user-friendly GUI tool for downloading Hugging Face resources with enhanced connectivity… ☆13 · Jan 5, 2025 · Updated last year
- ☆59 · Feb 27, 2025 · Updated last year
- [NeurIPS 2023] HAP: Structure-Aware Masked Image Modeling for Human-Centric Perception ☆44 · Mar 25, 2024 · Updated 2 years ago
- Official PyTorch implementation of LinCIR: Language-only Training of Zero-shot Composed Image Retrieval (CVPR 2024) ☆146 · Jan 5, 2026 · Updated 4 months ago
- A novel Multimodal Large Language Model (MLLM) architecture, designed to structurally align visual and textual embeddings. ☆1,450 · Feb 11, 2026 · Updated 3 months ago
- [NeurIPS 2025] Elevating Visual Perception in Multimodal LLMs with Visual Embedding Distillation ☆72 · Oct 17, 2025 · Updated 7 months ago
- Visual Delta Generator with Large Multi-modal Model for Semi-supervised Composed Image Retrieval (CVPR 2024) ☆21 · May 30, 2024 · Updated last year
- A CLI program for image retrieval using DINOv2 ☆79 · Jul 8, 2023 · Updated 2 years ago
- ☆37 · Jun 9, 2025 · Updated 11 months ago
- [NeurIPS 2024] Classification Done Right for Vision-Language Pre-Training ☆224 · Mar 20, 2025 · Updated last year
- Third-place solution for autonomous-driving traffic sign recognition in a virtual simulation environment ☆10 · Jan 11, 2020 · Updated 6 years ago