CaraJ7/MMSearch



[ICLR 2025] The First Multimodal Search Engine Pipeline and Benchmark for LMMs

☆495

Alternatives and similar repositories for MMSearch

Users who are interested in MMSearch are comparing it to the libraries listed below.


InternLM / MindSearch
🔍 An LLM-based Multi-agent Framework of Web Search Engine (like Perplexity.ai Pro and SearchGPT)
☆6,895 · Jul 4, 2025 · Updated last year
ULMEvalKit / ULMEvalKit
ULMEvalKit: One-Stop Eval ToolKit for Image Generation
☆56 · Dec 17, 2025 · Updated 7 months ago
CaraJ7 / CoMat
[NeurIPS 2024] 💫CoMat: Aligning Text-to-Image Diffusion Model with Image-to-Text Concept Matching
☆169 · Nov 18, 2024 · Updated last year
ZrrSkywalker / MathVerse
[ECCV 2024] Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems?
☆183 · Apr 28, 2025 · Updated last year
memfreeme / memfree
MemFree - Hybrid AI Search Engine & AI Page Generator
☆1,506 · Jul 6, 2026 · Updated 2 weeks ago
CaraJ7 / DraCo
Official Repository for Paper: DraCo: Draft as CoT for Text-to-Image Preview and Rare Concept Generation
☆17 · Dec 7, 2025 · Updated 7 months ago
EvolvingLMMs-Lab / multimodal-search-r1
[ACL-2026] MMSearch-R1 is an end-to-end RL framework that enables LMMs to perform on-demand, multi-turn search with real-world multimodal…
☆469 · Apr 7, 2026 · Updated 3 months ago
MME-Benchmarks / MME-CoT
MME-CoT: Benchmarking Chain-of-Thought in LMMs for Reasoning Quality, Robustness, and Efficiency
☆136 · Aug 5, 2025 · Updated 11 months ago
Vision-CAIR / LongVU
[ICML 2025] Official PyTorch implementation of LongVU
☆429 · May 8, 2025 · Updated last year
upstash / wikipedia-semantic-search
Semantic Search on Wikipedia with Upstash Vector
☆470 · Dec 12, 2025 · Updated 7 months ago
TIGER-AI-Lab / ABC
ABC: Achieving Better Control of Multimodal Embeddings using VLMs [TMLR2025]
☆19 · Aug 21, 2025 · Updated 11 months ago
CaraJ7 / T2I-R1
[NeurIPS 2025] T2I-R1: Reinforcing Image Generation with Collaborative Semantic-level and Token-level CoT
☆433 · Sep 18, 2025 · Updated 10 months ago
WangFei-2019 / SNARE
Project for SNARE benchmark
☆11 · Jun 5, 2024 · Updated 2 years ago
Alibaba-NLP / OmniSearch
Repo for Benchmarking Multimodal Retrieval Augmented Generation with Dynamic VQA Dataset and Self-adaptive Planning Agent
☆429 · Apr 22, 2025 · Updated last year
CosmosShadow / gptpdf
Using GPT to parse PDF
☆3,559 · Apr 17, 2025 · Updated last year
QmiAI / Qmedia
An open-source AI content search engine designed specifically for content creators. Supports extraction of text, images, and short videos…
☆626 · Apr 9, 2026 · Updated 3 months ago
fatwang2 / search4all
Personal AI search copilot, open-source Perplexity
☆783 · Aug 7, 2025 · Updated 11 months ago
THUDM / LongWriter
[ICLR 2025] LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs
☆1,868 · Jun 24, 2025 · Updated last year
ZrrSkywalker / MAVIS
[ICLR 2025] Mathematical Visual Instruction Tuning for Multi-modal Large Language Models
☆156 · Dec 5, 2024 · Updated last year
thinkany-ai / rag-search
RAG Search API
☆1,178 · Jul 29, 2024 · Updated last year
OcularEngineering / ocular
AI Powered Search and Chat for Orgs - Think ChatGPT meets Google Search but powered by your data.
☆445 · Sep 3, 2024 · Updated last year
Yusuke710 / nanoPerplexityAI
The simplest open-source implementation of perplexity.ai
☆336 · Jan 24, 2025 · Updated last year
build-with-groq / g1
g1: Using Llama-3.1 70b on Groq to create o1-like reasoning chains
☆4,176 · Dec 30, 2025 · Updated 6 months ago
uvavision / SyViC
[ICCV 2023] Going Beyond Nouns With Vision & Language Models Using Synthetic Data
☆13 · Sep 30, 2023 · Updated 2 years ago
multimodal-art-projection / MAP-NEO
☆985 · Feb 7, 2025 · Updated last year
HanSolo9682 / CounterCurate
This is the implementation of CounterCurate, the data curation pipeline of both physical and semantic counterfactual image-caption pairs.
☆19 · Jun 27, 2024 · Updated 2 years ago
julien-c / arxiv-to-hf
Chrome extension to add a link from each Arxiv page to the corresponding HF Paper page
☆26 · Jan 4, 2024 · Updated 2 years ago
Ucas-HaoranWei / GOT-OCR2.0
Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model
☆8,155 · Feb 10, 2025 · Updated last year
InternLM / InternLM-XComposer
InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions
☆2,921 · May 26, 2025 · Updated last year
LLaVA-VL / LLaVA-NeXT
☆4,710 · Jun 15, 2026 · Updated last month
VITA-MLLM / VITA
✨✨[NeurIPS 2025] VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction
☆2,520 · Mar 28, 2025 · Updated last year
Theigrams / g1
g1: Using GPT-4o to create o1-like reasoning chains
☆20 · Sep 17, 2024 · Updated last year
AlexZhangji / Twitter-Insight-LLM
Twitter data scraping, embedding based image search and more.
☆678 · Apr 17, 2024 · Updated 2 years ago
developersdigest / llm-answer-engine
Perplexity Inspired Answer Engine
☆5,031 · Apr 29, 2026 · Updated 2 months ago
lucasmrdt / TheBigPromptLibrary
A collection of prompts, system prompts and LLM instructions
☆635 · Sep 5, 2024 · Updated last year
RLHF-V / RLHF-V
[CVPR'24] RLHF-V: Towards Trustworthy MLLMs via Behavior Alignment from Fine-grained Correctional Human Feedback
☆310 · Sep 11, 2024 · Updated last year
laiso / askrepo
Source code reading with LLM.
☆222 · May 30, 2025 · Updated last year
harry0703 / AudioNotes
Quickly extract audio and video content and organize it into structured Markdown notes
☆2,198 · Updated this week
shilinyan99 / CrossLMM
CrossLMM: Decoupling Long Video Sequences from LMMs via Dual Cross-Attention Mechanisms
☆25 · Dec 21, 2025 · Updated 7 months ago