OpenGVLab/Docopilot

[CVPR 2025] Docopilot: Improving Multimodal Models for Document-Level Understanding

☆37

Alternatives and similar repositories for Docopilot

Users who are interested in Docopilot are comparing it to the repositories listed below.

Mungeryang / colqwen3
The code used to train and run inference with the ColQwen3 model. Welcome to follow and star! ⭐️⭐️⭐️ https://huggingface.co/goodman2001/…
☆15 · Jul 4, 2026 · Updated 2 weeks ago
DataArcTech / RagVL
Official PyTorch Implementation of MLLM Is a Strong Reranker: Advancing Multimodal Retrieval-augmented Generation via Knowledge-enhanced …
☆92 · Nov 15, 2024 · Updated last year
Alpha-Innovator / DocParser
☆18 · Jan 13, 2025 · Updated last year
felix-schmitt / MathNet
MathNet: A Data-Centric Approach, Dataset and Benchmark Model to Advance Mathematical Expression Recognition
☆10 · Mar 19, 2025 · Updated last year
193746 / VHASR
☆11 · Oct 31, 2024 · Updated last year
nttstar / inswapper-512-live
☆14 · Jan 26, 2025 · Updated last year
anxiangsir / Video_Benchmark_Suite
Video Benchmark Suite: Rapid Evaluation of Video Foundation Models
☆17 · Jan 10, 2025 · Updated last year
ctu-vras / lidar-intensity
☆12 · Jan 14, 2021 · Updated 5 years ago
Episoode / Double-Bench
[AAAI-26] Are We on the Right Way for Assessing Document Retrieval-Augmented Generation?
☆31 · Dec 14, 2025 · Updated 7 months ago
sakura2233565548 / TabPedia
This repository is the codebase of TabPedia: Towards Comprehensive Visual Table Understanding with Concept Synergy
☆51 · Oct 16, 2024 · Updated last year
mk-minchul / sapiensid
☆26 · Nov 17, 2025 · Updated 8 months ago
Vespa314 / ZJUthesis
Zhejiang University master's thesis template
☆10 · Mar 14, 2015 · Updated 11 years ago
Letian2003 / MM_INF
An efficient multi-modal instruction-following data synthesis tool and the official implementation of Oasis https://arxiv.org/abs/2503.08…
☆40 · Jun 4, 2025 · Updated last year
nttmdlab-nlp / VDocRAG
[CVPR2025] VDocRAG: Retrieval-Augmented Generation over Visually-Rich Documents
☆66 · May 26, 2025 · Updated last year
lzyhha / HSSL
Enhancing Representations through Heterogeneous Self-Supervised Learning (TPAMI 2025)
☆15 · May 2, 2025 · Updated last year
anxiangsir / V-SWIFT
V-SWIFT: Training a Small VideoMAE Model on a Single Machine in a Day
☆30 · Feb 5, 2025 · Updated last year
jina-ai / jina-vdr
Jina VDR is a multilingual, multi-domain benchmark for visual document retrieval
☆38 · Aug 4, 2025 · Updated 11 months ago
Jiaxing-star / LLaVA-Octopus
☆11 · Jan 8, 2025 · Updated last year
ZX-Yin / DreamLifting
The code implementation for the paper "DreamLifting: A Plug-in Module Lifting MV Diffusion Models for 3D Asset Generation".
☆30 · Sep 1, 2025 · Updated 10 months ago
1MENU / Korean_ABSA_model
[🎖️ 1st place (Minister's Award) solution] 2022 National Institute of Korean Language AI Language Proficiency Evaluation (Aspect-Based Sentiment Analysis of shopping mall review data)
☆11 · Jun 6, 2023 · Updated 3 years ago
yh-hust / DocSeeker
[CVPR 2026 Highlight] DocSeeker: Structured Visual Reasoning with Evidence Grounding for Long Document Understanding
☆18 · Jun 4, 2026 · Updated last month
mvkolos / siamese-change-detection
Targeted synthesis of multi-temporal remote sensing images for change detection using siamese neural networks
☆24 · Feb 15, 2019 · Updated 7 years ago
Ghy0501 / HiDe-LLaVA
[ACL'25 Main] Official Implementation of HiDe-LLaVA: Hierarchical Decoupling for Continual Instruction Tuning of Multimodal Large Languag…
☆55 · Jun 1, 2026 · Updated last month
lyj20071013 / Triton-FlashAttention
This repository contains multiple implementations of Flash Attention optimized with Triton kernels, showcasing progressive performance im…
☆11 · Mar 26, 2026 · Updated 3 months ago
wangwangwang23333 / OS-FileManagement
Third course project for Operating Systems: a simple file system
☆12 · Jun 24, 2021 · Updated 5 years ago
Fiquem / Expression-Packing
Code for the Expression Packing algorithm to be published in Eurographics 2020
☆16 · May 27, 2020 · Updated 6 years ago
360CVGroup / RzenEmbed
Embedding model prioritized towards Multimodal RAG, overall + VisDoc double top1 on MMEB benchmark
☆36 · Jun 16, 2026 · Updated last month
jiang-haoyuan / X-Light
☆17 · Jan 19, 2024 · Updated 2 years ago
PietroMsn / Functional-skeleton-transfer
☆15 · Feb 28, 2023 · Updated 3 years ago
anujanegi / VQA
Visual Question Answering System
☆11 · Nov 13, 2019 · Updated 6 years ago
Akiya-Research-Institute / NNEngine-Demo
Demo project for NNEngine
☆12 · Jun 26, 2025 · Updated last year
NEUIR / HIPPO
HIPPO: Enhancing the Table Understanding Capability of Large Language Models through Hybrid-Modal Preference Optimization
☆18 · May 29, 2025 · Updated last year
Yuliang-Liu / MultimodalOCR
On the Hidden Mystery of OCR in Large Multimodal Models (OCRBench)
☆870 · Updated this week
Ethanhuhuhu / KAC
☆21 · Jul 22, 2025 · Updated 11 months ago
Oneflow-Inc / oneflow_face
☆12 · Aug 10, 2022 · Updated 3 years ago
yuaanlin / zju-se-project
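Final project for the Zhejiang University Spring-Summer 2022 "Software Engineering" course. Each group of five is responsible for one module, and the five modules together form a complete medical management system.
☆21 · Jun 18, 2022 · Updated 4 years ago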
mahadi-nahid / TabSQLify
[NAACL 2024] TabSQLify: Enhancing Reasoning Capabilities of LLMs Through Table Decomposition
☆18 · Jan 5, 2026 · Updated 6 months ago
Kwai-YuanQi / TaskGalaxy
Scaling Multi-modal Instruction Fine-tuning with Tens of Thousands Vision Task Types
☆32 · Jul 16, 2025 · Updated last year
SNU-VGILab / Liv3Stroke
Official Repository of Recovering Dynamic 3D Sketches from Videos (CVPR 2025)
☆14 · Mar 2, 2026 · Updated 4 months ago