[CVPR 2025] Docopilot: Improving Multimodal Models for Document-Level Understanding
☆37Jul 22, 2025Updated 9 months ago
Alternatives and similar repositories for Docopilot
Users that are interested in Docopilot are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- Math24o: 高中奥林匹克数学竞赛测评集 High School Olympiad Mathematics Chinese Benchmark☆11Mar 27, 2025Updated last year
- Official PyTorch Implementation of MLLM Is a Strong Reranker: Advancing Multimodal Retrieval-augmented Generation via Knowledge-enhanced …☆92Nov 15, 2024Updated last year
- Video Benchmark Suite: Rapid Evaluation of Video Foundation Models☆16Jan 10, 2025Updated last year
- Code and Data for "FaithfulRAG: Fact-Level Conflict Modeling for Context-Faithful Retrieval-Augmented Generation" (ACL25)☆32Oct 26, 2025Updated 6 months ago
- This repository is the codebase of TabPedia: Towards Comprehensive Visual Table Understanding with Concept Synergy☆51Oct 16, 2024Updated last year
- Managed Database hosting by DigitalOcean • AdPostgreSQL, MySQL, MongoDB, Kafka, Valkey, and OpenSearch available. Automatically scale up storage and focus on building your apps.
- ATM-Bench: A benchmark for long-term personalized memory QA spanning ~4 years of multimodal data (images, videos, emails). Features refer…☆39Apr 10, 2026Updated 2 weeks ago
- MathNet: A Data-Centric Approach, Dataset and Benchmark Model to Advance Mathematical Expression Recognition☆10Mar 19, 2025Updated last year
- [ECCV'24] Official Implementation of Autoregressive Visual Entity Recognizer.☆14Mar 2, 2024Updated 2 years ago
- code for A Large-scale Dataset for Audio-Language Representation Learning☆14Sep 18, 2024Updated last year
- The official repository of MM-R5☆29Jun 22, 2025Updated 10 months ago
- [一个聊天软件Demo] a chat software powered by libevent/mysql and qt☆10Sep 10, 2021Updated 4 years ago
- ☆11Apr 26, 2019Updated 7 years ago
- Large-scale text embedding model☆39Sep 6, 2025Updated 7 months ago
- This repository provides the code for applying Contrastive Learning Penalty Loss (CLPL) and Mixture of Experts (MoE) to the BGE-M3 text e…☆11Dec 27, 2024Updated last year
- Virtual machines for every use case on DigitalOcean • AdGet dependable uptime with 99.99% SLA, simple security tools, and predictable monthly pricing with DigitalOcean's virtual machines, called Droplets.
- Local DeepSearch (Advantage: Low Threshold): an implementation of Agentic RAG based on DeepSeek-R1 API and Tavily API☆17Jun 21, 2025Updated 10 months ago