[CVPR 2025] Docopilot: Improving Multimodal Models for Document-Level Understanding
☆37Jul 22, 2025Updated 10 months ago
Alternatives and similar repositories for Docopilot
Users who are interested in Docopilot are comparing it to the libraries listed below.
- Math24o: High School Olympiad Mathematics Chinese Benchmark☆12Mar 27, 2025Updated last year
- Official PyTorch Implementation of MLLM Is a Strong Reranker: Advancing Multimodal Retrieval-augmented Generation via Knowledge-enhanced …☆92Nov 15, 2024Updated last year
- ☆11Oct 31, 2024Updated last year
- Video Benchmark Suite: Rapid Evaluation of Video Foundation Models☆17Jan 10, 2025Updated last year
- ☆14Jan 26, 2025Updated last year
- This repository is the codebase of TabPedia: Towards Comprehensive Visual Table Understanding with Concept Synergy☆51Oct 16, 2024Updated last year
- ATM-Bench: A benchmark for long-term personalized memory QA spanning ~4 years of multimodal data (images, videos, emails). Features refer…☆45Jun 1, 2026Updated last week
- [ECCV'24] Official Implementation of Autoregressive Visual Entity Recognizer.☆14Mar 2, 2024Updated 2 years ago
- ☆60Aug 5, 2025Updated 10 months ago
- V-SWIFT: Training a Small VideoMAE Model on a Single Machine in a Day☆30Feb 5, 2025Updated last year
- The official repository of MM-R5☆29Jun 22, 2025Updated 11 months ago
- [Chat software demo] Chat software powered by libevent/mysql and Qt☆10Sep 10, 2021Updated 4 years ago
- A High-Quality Diabetic Retinopathy Pixel-Level Annotation Dataset☆17Dec 9, 2025Updated 6 months ago
- serverless vscode webide☆17Apr 14, 2023Updated 3 years ago
- Handwriting Trajectory Recovery using End-to-End Deep Encoder-Decoder Network, ICPR 2018.☆15Jul 17, 2019Updated 6 years ago
- Targeted synthesis of multi-temporal remote sensing images for change detection using siamese neural networks☆24Feb 15, 2019Updated 7 years ago
- ☆16Oct 6, 2024Updated last year
- Tongji University résumé template with minor localization changes (generated from fky2015/resume-ng)☆17Dec 3, 2023Updated 2 years ago
- SimKO: Simple Pass@K Policy Optimization☆30Oct 24, 2025Updated 7 months ago
- [EMNLP 2025] The official implementation of "Zero-shot Multimodal Document Retrieval via Cross-Modal Question Generation"☆15Aug 26, 2025Updated 9 months ago
- FaceShield: Explainable Face Anti-Spoofing with Multimodal Large Language Models☆13Dec 21, 2025Updated 5 months ago
- The official repo for the DanQing dataset.☆36Mar 25, 2026Updated 2 months ago
- [🎖️ 1st place (Minister's Award) solution] 2022 National Institute of Korean Language AI Language Proficiency Evaluation (Aspect-Based Sentiment Analysis of shopping mall review data)☆11Jun 6, 2023Updated 3 years ago
- ☆12Aug 10, 2022Updated 3 years ago
- ☆18Mar 19, 2023Updated 3 years ago
- WebRTC demo☆34Jan 31, 2013Updated 13 years ago
- Code for the MTEB leaderboard☆31Feb 4, 2025Updated last year
- EMNLP 2024 | Style-Specific Neurons for Steering LLMs in Text Style Transfer☆13Mar 23, 2025Updated last year
- ☆43Jan 9, 2026Updated 4 months ago
- ☆23Apr 23, 2019Updated 7 years ago
- Bezier AE approach to sketch generation☆31Jul 7, 2020Updated 5 years ago
- Elevator scheduling, operating systems course assignment☆18Jun 26, 2018Updated 7 years ago
- Reasoning Agentic Retrieval-Augmented Generation for Industry Challenges☆28May 14, 2025Updated last year
- ICCV 2025: Official Implementation of "Aligning Vision to Language: Annotation-Free Multimodal Knowledge Graph Construction for Enhanced L…☆72Oct 25, 2025Updated 7 months ago
- Operating system memory management project☆14Jun 5, 2021Updated 5 years ago
- A Qt5 app that plots timestamped MQTT data – status: unfinished alpha software.☆10May 7, 2022Updated 4 years ago
- ☆17Apr 8, 2026Updated 2 months ago
- [CVPR2025] VDocRAG: Retrieval-Augmented Generation over Visually-Rich Documents☆66May 26, 2025Updated last year
- Repository for augmenting data in forms, invoices and receipts for document image understanding☆17May 6, 2021Updated 5 years ago