VectorSpaceLab / MegaPairs
MegaPairs: Massive Data Synthesis For Universal Multimodal Retrieval
☆157 · Updated 3 weeks ago
Alternatives and similar repositories for MegaPairs:
Users interested in MegaPairs are comparing it to the repositories listed below.
- Research Code for Multimodal-Cognition Team in Ant Group ☆143 · Updated 9 months ago
- Repo for Benchmarking Multimodal Retrieval Augmented Generation with Dynamic VQA Dataset and Self-adaptive Planning Agent ☆310 · Updated 2 weeks ago
- vLLM-accelerated implementation of GOT, combined with MinerU for PDF parsing in RAG ☆56 · Updated 6 months ago
- Valley is a cutting-edge multimodal large model designed to handle a variety of tasks involving text, images, and video data. ☆232 · Updated 2 months ago
- ☆173 · Updated 3 months ago
- 🔥🔥 First-ever hour-scale video understanding models ☆309 · Updated 2 weeks ago
- A Survey of Multimodal Retrieval-Augmented Generation ☆17 · Updated 3 weeks ago
- Vision Search Assistant: Empower Vision-Language Models as Multimodal Search Engines ☆123 · Updated 6 months ago
- 【ArXiv】PDF-Wukong: A Large Multimodal Model for Efficient Long PDF Reading with End-to-End Sparse Sampling ☆116 · Updated 6 months ago
- Training a LLaVA model with better Chinese support; the training code and data are open-sourced. ☆54 · Updated 8 months ago
- R1-onevision, a visual language model capable of deep CoT reasoning.