ccprocessor / llm-webkit-mirror
☆23Updated 2 weeks ago
Alternatives and similar repositories for llm-webkit-mirror
Users who are interested in llm-webkit-mirror are comparing it to the libraries listed below.
- MPB (Miner-PDF-Benchmark) is an end-to-end PDF document comprehension evaluation suite designed for large-scale model data scenarios.☆23Updated 11 months ago
- A Python package for interacting with the MinerU Vision-Language Model.☆69Updated last week
- [ACM'MM 2024 Oral] Official code for "OneChart: Purify the Chart Structural Extraction via One Auxiliary Token"☆254Updated 7 months ago
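- PDF parsing tool: a vLLM-accelerated implementation of GOT. MinerU handles layout detection and cropping, GOT parses tables and formulas, enabling PDF parsing for RAG.☆66Updated last year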
- Dataset and Code for our ACL 2024 paper: "Multimodal Table Understanding". We propose the first large-scale Multimodal IFT and Pre-Train …☆220Updated 5 months ago
- official code for "Fox: Focus Anywhere for Fine-grained Multi-page Document Understanding"☆184Updated last year
- [ICML 2025] Programming Every Example: Lifting Pre-training Data Quality Like Experts at Scale☆263Updated 4 months ago
- ☆180Updated 2 years ago
- The RedStone repository includes code for preparing extensive datasets used in training large language models.☆145Updated 4 months ago
- ☆54Updated last year
- ☆49Updated last year
- a-m-team's exploration in large language modeling☆192Updated 5 months ago
- Document Artificial Intelligence☆192Updated last month
- ☆103Updated this week
- ☆142Updated last year
- Vary-tiny codebase built on LAVIS (for training from scratch) and a dataset of PDF image-text pairs (about 600k, including English and Chinese)☆86Updated last year
- A High-efficiency Open-source Toolkit for Table-to-Latex Task☆267Updated 11 months ago
- datasets resource☆125Updated 4 months ago
- [ACL2024 Findings] Agent-FLAN: Designing Data and Methods of Effective Agent Tuning for Large Language Models☆355Updated last year
- ☆172Updated 6 months ago
- WanJuan 1.0 multimodal corpus☆567Updated 2 years ago
- [ACL'24] Superfiltering: Weak-to-Strong Data Filtering for Fast Instruction-Tuning☆182Updated 4 months ago
- a toolkit on knowledge distillation for large language models☆200Updated 2 weeks ago
- [ACL2024] T-Eval: Evaluating Tool Utilization Capability of Large Language Models Step by Step☆297Updated last year
- [EMNLP 2023 Demo] "CLEVA: Chinese Language Models EVAluation Platform"☆62Updated 6 months ago
- Baichuan-Omni: Towards Capable Open-source Omni-modal LLM 🌊☆269Updated 9 months ago
- Dingo: A Comprehensive AI Data Quality Evaluation Tool☆564Updated this week
- Repo for Benchmarking Multimodal Retrieval Augmented Generation with Dynamic VQA Dataset and Self-adaptive Planning Agent☆393Updated 7 months ago
- ☆147Updated last year
- ☆314Updated last year