Alibaba-NLP/OmniSearch

Repo for Benchmarking Multimodal Retrieval Augmented Generation with Dynamic VQA Dataset and Self-adaptive Planning Agent

☆429

Alternatives and similar repositories for OmniSearch

Users that are interested in OmniSearch are comparing it to the libraries listed below.

EvolvingLMMs-Lab / multimodal-search-r1
[ACL-2026] MMSearch-R1 is an end-to-end RL framework that enables LMMs to perform on-demand, multi-turn search with real-world multimodal…
☆470 · Apr 7, 2026 · Updated 3 months ago
OpenBMB / VisRAG
Parsing-free RAG supported by VLMs
☆975 · Jul 17, 2026 · Updated last week
Alibaba-NLP / MaskSearch
Repo for "MaskSearch: A Universal Pre-Training Framework to Enhance Agentic Search Capability"
☆155 · May 27, 2025 · Updated last year
mi92 / reverse-image-rag
☆15 · Jul 8, 2024 · Updated 2 years ago
DataArcTech / RagVL
Official PyTorch Implementation of MLLM Is a Strong Reranker: Advancing Multimodal Retrieval-augmented Generation via Knowledge-enhanced …
☆92 · Nov 15, 2024 · Updated last year
modelscope / PromptScope
Enjoy easier conversations with LLM
☆46 · Mar 13, 2025 · Updated last year
edchengg / infoseek_eval
EMNLP2023 - InfoSeek: A New VQA Benchmark Focusing on Visual Info-Seeking Questions
☆26 · May 30, 2024 · Updated 2 years ago
Gabesarch / ICAL
☆53 · May 11, 2025 · Updated last year
RUC-NLPIR / Search-o1
🔍 Search-o1: Agentic Search-Enhanced Large Reasoning Models [EMNLP 2025]
☆1,240 · Nov 17, 2025 · Updated 8 months ago
CaraJ7 / MMSearch
[ICLR 2025] The First Multimodal Search Engine Pipeline and Benchmark for LMMs
☆496 · Apr 5, 2026 · Updated 3 months ago
WePOINTS / WePOINTS
☆190 · Mar 13, 2026 · Updated 4 months ago
NeverMoreLCH / SearchLVLMs
Repository for the NeurIPS 2024 paper "SearchLVLMs: A Plug-and-Play Framework for Augmenting Large Vision-Language Models by Searching Up…
☆25 · Dec 9, 2024 · Updated last year
mragbench / MRAG-Bench
[ICLR 2025] Vision-Centric Evaluation for Retrieval-Augmented Multimodal Models
☆63 · Jan 22, 2025 · Updated last year
Alibaba-NLP / ZeroSearch
ZeroSearch: Incentivize the Search Capability of LLMs without Searching
☆1,307 · Aug 16, 2025 · Updated 11 months ago
cnzzx / VSA
Vision Search Assistant: Empower Vision-Language Models as Multimodal Search Engines
☆128 · Nov 6, 2024 · Updated last year
LinWeizheDragon / Retrieval-Augmented-Visual-Question-Answering
This is the official repository for Retrieval Augmented Visual Question Answering
☆252 · Dec 19, 2024 · Updated last year
Alibaba-NLP / CHRONOS
Repo for NAACL 2025 Paper "Unfolding the Headline: Iterative Self-Questioning for News Retrieval and Timeline Summarization"
☆297 · Aug 4, 2025 · Updated 11 months ago
si0wang / VisVM
☆46 · Dec 30, 2024 · Updated last year
jun0wanan / awesome-large-multimodal-agents
☆495 · Sep 25, 2024 · Updated last year
modelscope / ms-swift
Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 600+ LLMs (Qwen3.6, DeepSeek-V4, GLM-5.1, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL…
☆14,974 · Updated this week
PeterGriffinJin / Search-R1
Search-R1: An Efficient, Scalable RL Training Framework for Reasoning & Search Engine Calling interleaved LLM based on veRL
☆5,170 · Nov 13, 2025 · Updated 8 months ago
Alibaba-NLP / DeepResearch
Tongyi Deep Research, the Leading Open-source Deep Research Agent
☆19,741 · Feb 27, 2026 · Updated 5 months ago
ATH-MaaS / Marco-o1
An Open Large Reasoning Model for Real-World Solutions
☆1,537 · Jun 17, 2026 · Updated last month
Alibaba-NLP / VRAG
Multimodal Retrieval-augmented Generation Framework Built by Tongyi Lab, Alibaba Group.
☆970 · Apr 29, 2026 · Updated 3 months ago
RUC-NLPIR / FlashRAG
⚡FlashRAG: A Python Toolkit for Efficient RAG Research (WWW2025 Resource)
☆3,535 · Jul 19, 2026 · Updated last week
open-vision-language / infoseek
☆78 · Oct 27, 2023 · Updated 2 years ago
FlagOpen / FlagEmbedding
Retrieval and Retrieval-augmented LLMs
☆11,990 · Apr 22, 2026 · Updated 3 months ago
bytarnish / AGILE
☆166 · Jan 21, 2025 · Updated last year
RUCAIBox / R1-Searcher
R1-searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning
☆720 · Aug 5, 2025 · Updated 11 months ago
TIGER-AI-Lab / UniIR
Official code for paper "UniIR: Training and Benchmarking Universal Multimodal Information Retrievers" (ECCV 2024)
☆183 · Oct 1, 2024 · Updated last year
dongyh20 / Insight-V
[CVPR2025 Highlight] Insight-V: Exploring Long-Chain Visual Reasoning with Multimodal Large Language Models
☆240 · Nov 7, 2025 · Updated 8 months ago
hiyouga / EasyR1
EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL
☆5,083 · Updated this week
Alibaba-NLP / ViDoRAG
[EMNLP 2025] ViDoRAG: Visual Document Retrieval-Augmented Generation via Dynamic Iterative Reasoning Agents
☆669 · Jan 11, 2026 · Updated 6 months ago
OpenGVLab / InternVL
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o performance
☆10,106 · Sep 22, 2025 · Updated 10 months ago
facebookresearch / MetaEmbed
[ICLR 2026 Oral] Official Implementation of the paper "MetaEmbed: Scaling Multimodal Retrieval at Test-Time with Flexible Late Interactio…
☆18 · Jul 2, 2026 · Updated 3 weeks ago
kxfan2002 / SophiaVL-R1
SophiaVL-R1: Reinforcing MLLMs Reasoning with Thinking Reward
☆94 · Aug 8, 2025 · Updated 11 months ago
zjunlp / KnowAgent
[NAACL 2025] KnowAgent: Knowledge-Augmented Planning for LLM-Based Agents
☆260 · Jan 29, 2025 · Updated last year
llm-lab-org / Multimodal-RAG-Survey
A Survey on Multimodal Retrieval-Augmented Generation
☆537 · Feb 20, 2026 · Updated 5 months ago
open-compass / VLMEvalKit
Open-source evaluation toolkit for large multi-modality models (LMMs), supporting 220+ LMMs and 80+ benchmarks
☆4,307 · Jul 22, 2026 · Updated last week