[CVPR 2025] Docopilot: Improving Multimodal Models for Document-Level Understanding
☆36Jul 22, 2025Updated 7 months ago
Alternatives and similar repositories for Docopilot
Users who are interested in Docopilot are comparing it to the repositories listed below.
- ☆11Oct 31, 2024Updated last year
- Math24o: High School Olympiad Mathematics Chinese Benchmark☆11Mar 27, 2025Updated 11 months ago
- MathNet: A Data-Centric Approach, Dataset and Benchmark Model to Advance Mathematical Expression Recognition☆10Mar 19, 2025Updated 11 months ago
- ☆21Nov 17, 2025Updated 3 months ago
- Video Benchmark Suite: Rapid Evaluation of Video Foundation Models☆15Jan 10, 2025Updated last year
- ☆61Aug 5, 2025Updated 6 months ago
- Official PyTorch Implementation of MLLM Is a Strong Reranker: Advancing Multimodal Retrieval-augmented Generation via Knowledge-enhanced …☆91Nov 15, 2024Updated last year
- Rubik ESP32 esp-idf Device driver library.☆12Jul 3, 2021Updated 4 years ago
- V-SWIFT: Training a Small VideoMAE Model on a Single Machine in a Day☆29Feb 5, 2025Updated last year
- Evaluation framework for document processing models and services.☆63Feb 12, 2026Updated 2 weeks ago
- Verilog code for a low power RFID chip that will communicate with I2C sensors.☆13Apr 18, 2014Updated 11 years ago
- FaceShield: Explainable Face Anti-Spoofing with Multimodal Large Language Models☆10Dec 21, 2025Updated 2 months ago
- C++ code and MATLAB utilities for loading patterns onto TI DLP Digital Micromirror Device (DMD)☆14Dec 19, 2020Updated 5 years ago
- Code from the paper "Roboflow100-VL: A Multi-Domain Object Detection Benchmark for Vision-Language Models"☆121Jan 6, 2026Updated last month
- Just prepare config file and start training your metric learning model with ease☆16Apr 2, 2024Updated last year
- High-resolution time-to-digital converter in the Red Pitaya Zynq-7010 SoC☆10Jul 12, 2020Updated 5 years ago
- Official Repository of RefChartQA: Grounding Visual Answer on Chart Images through Instruction Tuning☆14Jul 9, 2025Updated 7 months ago
- ☆18Feb 16, 2025Updated last year
- ☆22Dec 11, 2025Updated 2 months ago
- EBAZ4205 Board FPGA project☆14Oct 20, 2023Updated 2 years ago
- Data Programming for Text Detection in Documents using SPEAR☆12Mar 26, 2025Updated 11 months ago
- Light Cube using PYNQ☆10Aug 4, 2018Updated 7 years ago
- A Unity project connecting to a local Pozyx MQTT positioning stream.☆10Sep 13, 2019Updated 6 years ago
- 🧠🖼️🐍 A Python wrapper around the BrainFrame REST API☆12Jan 7, 2025Updated last year
- ☆13Aug 19, 2025Updated 6 months ago
- TensorRT In Docker☆11Dec 7, 2024Updated last year
- Implementation of various handwritten text line segmentation☆10Jan 6, 2020Updated 6 years ago
- [EMNLP 2025] The official implementation of "Zero-shot Multimodal Document Retrieval via Cross-Modal Question Generation"☆15Aug 26, 2025Updated 6 months ago
- ICDAR 2021 Competition on Scientific Literature Parsing☆35Aug 20, 2020Updated 5 years ago
- ☆102Dec 23, 2024Updated last year
- ICDAR 2024 Table OCR Model☆39Feb 20, 2026Updated last week
- ☆10Jul 6, 2015Updated 10 years ago
- Alibaba DingTalk integration built on Odoo 12☆10Apr 24, 2019Updated 6 years ago
- Quantization of LLMs and benchmarking.☆10Apr 3, 2024Updated last year
- WebRTC demo☆34Jan 31, 2013Updated 13 years ago
- Large-scale text embedding model☆38Sep 6, 2025Updated 5 months ago
- An exploration of an OpenSpec-methodology version of novel-write☆18Oct 24, 2025Updated 4 months ago
- character recognition, textline recognition☆10Aug 31, 2019Updated 6 years ago
- ☆21Jun 16, 2025Updated 8 months ago