[CVPR 2025] Docopilot: Improving Multimodal Models for Document-Level Understanding
☆37Jul 22, 2025Updated 9 months ago
Alternatives and similar repositories for Docopilot
Users who are interested in Docopilot are comparing it to the libraries listed below.
- Math24o: High School Olympiad Mathematics Chinese Benchmark☆12Mar 27, 2025Updated last year
- ☆26Nov 17, 2025Updated 6 months ago
- Official PyTorch Implementation of MLLM Is a Strong Reranker: Advancing Multimodal Retrieval-augmented Generation via Knowledge-enhanced …☆92Nov 15, 2024Updated last year
- Video Benchmark Suite: Rapid Evaluation of Video Foundation Models☆17Jan 10, 2025Updated last year
- ☆14Jan 26, 2025Updated last year
- PyTorch implementation of the article "Generative Adversarial Network for Handwritten Text"☆10Nov 13, 2023Updated 2 years ago
- This repository is the codebase of TabPedia: Towards Comprehensive Visual Table Understanding with Concept Synergy☆51Oct 16, 2024Updated last year
- ATM-Bench: A benchmark for long-term personalized memory QA spanning ~4 years of multimodal data (images, videos, emails). Features refer…☆43Apr 10, 2026Updated last month
- MathNet: A Data-Centric Approach, Dataset and Benchmark Model to Advance Mathematical Expression Recognition☆10Mar 19, 2025Updated last year
- [ECCV'24] Official Implementation of Autoregressive Visual Entity Recognizer.☆14Mar 2, 2024Updated 2 years ago
- V-SWIFT: Training a Small VideoMAE Model on a Single Machine in a Day☆29Feb 5, 2025Updated last year
- Official release of Genos models.☆22Jan 30, 2026Updated 3 months ago
- code for A Large-scale Dataset for Audio-Language Representation Learning☆14Sep 18, 2024Updated last year
- This repository provides the code for applying Contrastive Learning Penalty Loss (CLPL) and Mixture of Experts (MoE) to the BGE-M3 text e…☆11Dec 27, 2024Updated last year
- Local DeepSearch (Advantage: Low Threshold): an implementation of Agentic RAG based on DeepSeek-R1 API and Tavily API☆17Jun 21, 2025Updated 10 months ago
- ☆19May 19, 2024Updated 2 years ago
- ☆70May 19, 2025Updated last year
- ☆47Apr 4, 2026Updated last month
- Targeted synthesis of multi-temporal remote sensing images for change detection using siamese neural networks☆24Feb 15, 2019Updated 7 years ago
- ☆16Oct 6, 2024Updated last year
- Just prepare config file and start training your metric learning model with ease☆16Apr 2, 2024Updated 2 years ago
- ☆16Jan 19, 2024Updated 2 years ago
- SimKO: Simple Pass@K Policy Optimization☆31Oct 24, 2025Updated 6 months ago
- [EMNLP 2025] The official implementation of "Zero-shot Multimodal Document Retrieval via Cross-Modal Question Generation"☆15Aug 26, 2025Updated 8 months ago
- FaceShield: Explainable Face Anti-Spoofing with Multimodal Large Language Models☆13Dec 21, 2025Updated 4 months ago
- [NAACL 2025🔥] MEDA: Dynamic KV Cache Allocation for Efficient Multimodal Long-Context Inference☆20Jun 19, 2025Updated 11 months ago
- The official repo for the DanQing dataset.☆35Mar 25, 2026Updated last month
- [🎖️1st place (Minister's Award) solution] 2022 National Institute of Korean Language AI Language Proficiency Evaluation (aspect-based sentiment analysis of shopping mall review data)☆11Jun 6, 2023Updated 2 years ago
- ☆12Aug 10, 2022Updated 3 years ago
- ☆18Mar 19, 2023Updated 3 years ago
- Jina VDR is a multilingual, multi-domain benchmark for visual document retrieval☆38Aug 4, 2025Updated 9 months ago
- WebRTC demo☆34Jan 31, 2013Updated 13 years ago
- Code for the MTEB leaderboard☆30Feb 4, 2025Updated last year
- EMNLP 2024 | Style-Specific Neurons for Steering LLMs in Text Style Transfer☆13Mar 23, 2025Updated last year
- ICCV 2025: Official Implementation of "Aligning Vision to Language: Annotation-Free Multimodal Knowledge Graph Construction for Enhanced L…☆72Oct 25, 2025Updated 6 months ago
- A clone from Max Jaderberg's Text Renderer☆34Jun 16, 2016Updated 9 years ago
- A Qt5 app that plots timestamped MQTT data – status: unfinished alpha software.☆10May 7, 2022Updated 4 years ago
- [CVPR2025] VDocRAG: Retrieval-Augmented Generation over Visually-Rich Documents☆67May 26, 2025Updated 11 months ago
- Repository for augmenting data in forms, invoices and receipts for document image understanding☆17May 6, 2021Updated 5 years ago