[ACL 2025 Oral] 🔥🔥 MegaPairs: Massive Data Synthesis for Universal Multimodal Retrieval
☆245 · Nov 6, 2025 · Updated 4 months ago
Alternatives and similar repositories for MegaPairs
Users who are interested in MegaPairs are comparing it to the repositories listed below
- LLaVE: Large Language and Vision Embedding Models with Hardness-Weighted Contrastive Learning ☆77 · May 23, 2025 · Updated 9 months ago
- official code for "Modality Curation: Building Universal Embeddings for Advanced Multimodal Information Retrieval" ☆42 · Jul 4, 2025 · Updated 8 months ago
- E5-V: Universal Embeddings with Multimodal Large Language Models ☆275 · Dec 10, 2025 · Updated 3 months ago
- Evaluation code and datasets for the ACL 2024 paper, VISTA: Visualized Text Embedding for Universal Multi-Modal Retrieval. The original c… ☆47 · Nov 16, 2024 · Updated last year
- [CVPR 2025] LamRA: Large Multimodal Model as Your Advanced Retrieval Assistant ☆179 · Jul 7, 2025 · Updated 8 months ago
- ☆24 · Oct 16, 2025 · Updated 5 months ago
- [ACM MM 2025] The official code of "Breaking the Modality Barrier: Universal Embedding Learning with Multimodal LLMs" ☆103 · Dec 8, 2025 · Updated 3 months ago
- Toward Universal Multimodal Embedding ☆75 · Aug 1, 2025 · Updated 7 months ago
- Reason-before-Retrieve: One-Stage Reflective Chain-of-Thoughts for Training-Free Zero-Shot Composed Image Retrieval [CVPR 2025 Highlight] ☆66 · Jul 8, 2025 · Updated 8 months ago
- [AAAI 2026 Oral] The official code of "UniME-V2: MLLM-as-a-Judge for Universal Multimodal Embedding Learning" ☆68 · Dec 8, 2025 · Updated 3 months ago
- Comprehensive benchmark for video text understanding ☆28 · Jun 4, 2025 · Updated 9 months ago
- New generation of CLIP with strong fine grained discrimination capability, ICML2025 ☆556 · Oct 27, 2025 · Updated 4 months ago
- [CVPR25] CoLLM: A Large Language Model for Composed Image Retrieval ☆28 · Mar 26, 2025 · Updated 11 months ago
- Official implementation of paper "Reason4Rec: Large Language Models for Recommendation with Deliberative User Preference Alignment" ☆44 · Apr 10, 2025 · Updated 11 months ago
- Official code for paper "UniIR: Training and Benchmarking Universal Multimodal Information Retrievers" (ECCV 2024) ☆178 · Oct 1, 2024 · Updated last year
- Code for "CAFe: Unifying Representation and Generation with Contrastive-Autoregressive Finetuning" ☆32 · Mar 26, 2025 · Updated 11 months ago
- Retrieval and Retrieval-augmented LLMs ☆11,410 · Mar 10, 2026 · Updated last week
- Math24o: High School Olympiad Mathematics Chinese Benchmark ☆11 · Mar 27, 2025 · Updated 11 months ago
- [ICLR 2025 Spotlight] OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text ☆415 · May 5, 2025 · Updated 10 months ago
- ☆39 · Jan 12, 2026 · Updated 2 months ago
- ☆19 · Mar 5, 2025 · Updated last year
- ☆12 · Oct 3, 2023 · Updated 2 years ago
- Search and reranking implemented with Qwen3's Embedding and Reranker models ☆21 · Jun 22, 2025 · Updated 8 months ago
- GPT4V-level open-source multi-modal model based on Llama3-8B ☆2,434 · Mar 3, 2025 · Updated last year
- ☆14 · Aug 28, 2024 · Updated last year
- ☆58 · Feb 27, 2025 · Updated last year
- The simple demo of `Unified Vision-Language Representation Modeling for E-Commerce Same-Style Products Retrieval` ☆13 · Dec 9, 2024 · Updated last year
- Solve Visual Understanding with Reinforced VLMs ☆5,872 · Mar 12, 2026 · Updated last week
- 🔥🔥First-ever hour scale video understanding models ☆615 · Jul 14, 2025 · Updated 8 months ago
- mllm-npu: training multimodal large language models on Ascend NPUs ☆94 · Aug 29, 2024 · Updated last year
- Official Pytorch implementation of LinCIR: Language-only Training of Zero-shot Composed Image Retrieval (CVPR 2024) ☆144 · Jan 5, 2026 · Updated 2 months ago
- [NeurIPS 2023] HAP: Structure-Aware Masked Image Modeling for Human-Centric Perception ☆44 · Mar 25, 2024 · Updated last year
- 🤗 HF Downloader (Hugging Face Downloader) 📦 A user-friendly GUI tool for downloading Hugging Face resources with enhanced connectivity… ☆13 · Jan 5, 2025 · Updated last year
- [NeurIPS 2025] Elevating Visual Perception in Multimodal LLMs with Visual Embedding Distillation ☆71 · Oct 17, 2025 · Updated 5 months ago
- DenseFusion-1M: Merging Vision Experts for Comprehensive Multimodal Perception ☆159 · Dec 6, 2024 · Updated last year
- Visual Delta Generator with Large Multi-modal Model for Semi-supervised Composed Image Retrieval - CVPR2024 ☆21 · May 30, 2024 · Updated last year
- A cli program of image retrieval using dinov2 ☆79 · Jul 8, 2023 · Updated 2 years ago
- AMES: Asymmetric and Memory-Efficient Similarity ☆46 · Aug 12, 2025 · Updated 7 months ago
- ☆35 · Jun 9, 2025 · Updated 9 months ago