ccprocessor / llm-webkit-mirror
☆18Updated this week
Alternatives and similar repositories for llm-webkit-mirror
Users that are interested in llm-webkit-mirror are comparing it to the libraries listed below
- MPB (Miner-PDF-Benchmark) is an end-to-end PDF document comprehension evaluation suite designed for large-scale model data scenarios.☆23Updated 5 months ago
- ☆169Updated last year
- 360LayoutAnalysis, a series of document analysis models and datasets developed by 360 AI Research Institute☆282Updated 8 months ago
- SDK of OpenDataLab - https://opendatalab.org.cn☆57Updated last year
- datasets resource☆114Updated last month
- [ACL2024 Findings] Agent-FLAN: Designing Data and Methods of Effective Agent Tuning for Large Language Models☆347Updated last year
- vLLM-accelerated implementation of GOT, combined with MinerU for PDF parsing in RAG☆57Updated 7 months ago
- A High-efficiency Open-source Toolkit for Table-to-Latex Task☆242Updated 5 months ago
- [ACM'MM 2024 Oral] Official code for "OneChart: Purify the Chart Structural Extraction via One Auxiliary Token"☆224Updated last month
- Dataset and Code for our ACL 2024 paper: "Multimodal Table Understanding". We propose the first large-scale Multimodal IFT and Pre-Train …☆202Updated last month
- ☆228Updated last year
- [ACL2024] T-Eval: Evaluating Tool Utilization Capability of Large Language Models Step by Step☆274Updated last year
- [CVPR 2025] A Comprehensive Benchmark for Document Parsing and Evaluation☆468Updated 3 weeks ago
- ☆328Updated 11 months ago
- Dingo: A Comprehensive Data Quality Evaluation Tool☆168Updated last week
- WanJuan 1.0 multimodal corpus☆561Updated last year
- ☆128Updated 3 weeks ago
- ☆63Updated 2 years ago
- MegaPairs: Massive Data Synthesis For Universal Multimodal Retrieval☆183Updated 2 weeks ago
- ☆141Updated last year
- [ACL 2025] An official PyTorch implementation of the paper: Condor: Enhance LLM Alignment with Knowledge-Driven Data Synthesis and Refinement☆27Updated last week
- code for piccolo embedding model from SenseTime☆126Updated last year
- ☆135Updated last year
- CDLA: A Chinese document layout analysis (CDLA) dataset☆267Updated 3 years ago
- deepResearch☆38Updated last month
- Imitate OpenAI with Local Models☆87Updated 9 months ago
- Repo for Benchmarking Multimodal Retrieval Augmented Generation with Dynamic VQA Dataset and Self-adaptive Planning Agent☆329Updated last month
- ☆47Updated 11 months ago
- Data Set Description Language Specification (DSDL, a next-generation description language for AI datasets)☆47Updated last year
- Text deduplication☆72Updated last year