MananSuri27/VisDoM

☆45

Alternatives and similar repositories for VisDoM

Users who are interested in VisDoM are comparing it to the repositories listed below.

WxxShirley / MoLoRAG
[EMNLP 2025] Official implementation for paper "MoLoRAG: Bootstrapping Document Understanding via Multi-modal Logic-aware Retrieval"
☆27 · Mar 17, 2026 · Updated 4 months ago
dengc2023 / LongDocURL
☆42 · Apr 6, 2026 · Updated 3 months ago
MRAMG-Bench / MRAMG
[SIGIR 2025] Official impl. of "MRAMG-Bench: A Comprehensive Benchmark for Advancing Multimodal Retrieval-Augmented Multimodal Generation…
☆19 · Apr 15, 2025 · Updated last year
mayubo2333 / MMLongBench-Doc
Official Repository of MMLONGBENCH-DOC: Benchmarking Long-context Document Understanding with Visualizations
☆149 · Sep 28, 2025 · Updated 9 months ago
aiming-lab / MDocAgent
MDocAgent: A Multi-Modal Multi-Agent Framework for Document Understanding
☆352 · Aug 8, 2025 · Updated 11 months ago
llm-lab-org / Multimodal-RAG-Survey
A Survey on Multimodal Retrieval-Augmented Generation
☆533 · Feb 20, 2026 · Updated 5 months ago
megagonlabs / starmie
Resources for PVLDB 2023 submission
☆29 · Aug 28, 2024 · Updated last year
maziao / M2RAG
Implementation of "Multi-modal Retrieval Augmented Multi-modal Generation: Datasets, Evaluation Metrics and Strong Baselines"
☆33 · Feb 24, 2025 · Updated last year
DMiC-Lab-HFUT / Query-Driven-Multimodal-GraphRAG
☆18 · Feb 5, 2026 · Updated 5 months ago
OpenGVLab / Docopilot
[CVPR 2025] Docopilot: Improving Multimodal Models for Document-Level Understanding
☆37 · Jul 22, 2025 · Updated last year
google / spiqa
Code release for "SPIQA: A Dataset for Multimodal Question Answering on Scientific Papers" [NeurIPS D&B, 2024]
☆76 · Jan 13, 2025 · Updated last year
HJYao00 / MMReason
[ICCV 2025] MMReason, MLLMs, step by step, reasoning benchmark, AGI
☆15 · Apr 25, 2026 · Updated 3 months ago
SalesforceAIResearch / UniDoc-Bench
☆38 · Jun 2, 2026 · Updated last month
Aeryn666 / RegionRAG
[AAAI2026] Source code for RegionRAG
☆24 · Apr 20, 2026 · Updated 3 months ago
VLR-CVC / DocVQA2026
Official evaluation scripts and baseline prompts for the DocVQA 2026 (ICDAR 2026) Competition on Multimodal Reasoning over Documents.
☆16 · Mar 16, 2026 · Updated 4 months ago
soyoung97 / ListT5
official repository for ListT5
☆51 · Nov 27, 2025 · Updated 7 months ago
hongzhouyu / FineMed
The codebase and some introductions of FineMed.
☆31 · Sep 11, 2025 · Updated 10 months ago
microsoft / SheetBrain
☆31 · Apr 24, 2026 · Updated 3 months ago
SuDIS-ZJU / nlcTables
☆15 · Jan 27, 2026 · Updated 5 months ago
slowfast-vgen / slowfast-vgen
☆21 · Oct 31, 2024 · Updated last year
HKUSTDial / nvBench-2.0
🔥 [NeurIPS'25] nvBench 2.0: Resolving Ambiguity in Text-to-Visualization through Stepwise Reasoning
☆26 · Nov 13, 2025 · Updated 8 months ago
ziaoang / AutoRec
☆13 · Jun 17, 2016 · Updated 10 years ago
sheetagent / sheetagent.github.io
☆14 · Apr 25, 2025 · Updated last year
EmergingUnicorns / DeepPaint
☆11 · Oct 22, 2023 · Updated 2 years ago
razvan404 / multimodal-speech-emotion-recognition
Multimodal SER Model meant to be trained on recognising emotions from speech (text + acoustic data). Fine-tuned the DeBERTaV3 model, resp…
☆11 · Jun 19, 2024 · Updated 2 years ago
HKUDS / RCL
[Recsys'2023] "RCL: Multi-Relational Contrastive Learning for Recommendation"
☆16 · Sep 6, 2023 · Updated 2 years ago
jiangshdd / ReviewCritique
☆13 · Sep 26, 2024 · Updated last year
Qyu-ai / Reina
PySpark-based causal inference package.
☆13 · Aug 20, 2021 · Updated 4 years ago
bloomberg / m3docrag
☆71 · May 19, 2025 · Updated last year
THU-KEG / ReaRAG
ReaRAG: Knowledge-guided Reasoning Enhances Factuality of Large Reasoning Models with Iterative Retrieval Augmented Generation
☆28 · Aug 24, 2025 · Updated 11 months ago
strikingly / blog
☆10 · Jan 28, 2016 · Updated 10 years ago
v-manhlt3 / m-LTM-Audio-Text-Retrieval
☆13 · Jan 5, 2025 · Updated last year
LCBHSStudent / fvck-this-term-collection-BUPT
Collection of course design during the 2nd term of GRADE 2 in CS BUPT
☆13 · Sep 11, 2020 · Updated 5 years ago
CreaLabs / Enhanced-BGE-M3-with-CLP-and-MoE
This repository provides the code for applying Contrastive Learning Penalty Loss (CLPL) and Mixture of Experts (MoE) to the BGE-M3 text e…
☆11 · Dec 27, 2024 · Updated last year
DataArcTech / RagVL
Official PyTorch Implementation of MLLM Is a Strong Reranker: Advancing Multimodal Retrieval-augmented Generation via Knowledge-enhanced …
☆92 · Nov 15, 2024 · Updated last year
D0miH / does-clip-know-my-face
Source Code for the JAIR Paper "Does CLIP Know my Face?" (Demo: https://huggingface.co/spaces/AIML-TUDA/does-clip-know-my-face)
☆15 · Jul 9, 2024 · Updated 2 years ago
karan1149 / fake-news-detector-extension
Chrome extension for machine-learning powered fake news detector.
☆11 · Sep 15, 2017 · Updated 8 years ago
winter1203 / vllm_GOT2_OCR
Accelerating GOT-OCRv2 with VLLM
☆10 · Nov 15, 2024 · Updated last year
taoszhang / MMhops-R1
MMhops-R1: Multimodal Multi-hop Reasoning
☆16 · Feb 28, 2026 · Updated 4 months ago