ccprocessor / llm-webkit-mirror
☆19Updated this week
Alternatives and similar repositories for llm-webkit-mirror
Users who are interested in llm-webkit-mirror are comparing it to the libraries listed below.
- Dataset and Code for our ACL 2024 paper: "Multimodal Table Understanding". We propose the first large-scale Multimodal IFT and Pre-Train …☆209Updated last month
- ☆172Updated last year
- MPB (Miner-PDF-Benchmark) is an end-to-end PDF document comprehension evaluation suite designed for large-scale model data scenarios.☆23Updated 7 months ago
- [ACL2024 Findings] Agent-FLAN: Designing Data and Methods of Effective Agent Tuning for Large Language Models☆350Updated last year
- datasets resource☆117Updated 2 weeks ago
- vLLM-accelerated implementation of GOT, combined with MinerU for PDF parsing in RAG☆60Updated 8 months ago
- WanJuan 1.0 multimodal corpus☆561Updated last year
- Dingo: A Comprehensive AI Data Quality Evaluation Tool☆288Updated this week
- ☆345Updated last year
- [ACL2024] T-Eval: Evaluating Tool Utilization Capability of Large Language Models Step by Step☆280Updated last year
- Data Set Description Language Specification (DSDL, a next-generation description language for AI datasets)☆47Updated last year
- SDK of OpenDataLab - https://opendatalab.org.cn☆57Updated last year
- Repo for Benchmarking Multimodal Retrieval Augmented Generation with Dynamic VQA Dataset and Self-adaptive Planning Agent☆349Updated 2 months ago
- ☆230Updated last year
- The Open-Source Data Annotation Platform☆868Updated 4 months ago
- [ACM'MM 2024 Oral] Official code for "OneChart: Purify the Chart Structural Extraction via One Auxiliary Token"☆224Updated 3 months ago
- A High-efficiency Open-source Toolkit for Table-to-LaTeX Task☆251Updated 7 months ago
- 360LayoutAnalysis, a series of document analysis models and datasets developed by 360 AI Research Institute☆291Updated 10 months ago
- Multi-dimensional Chinese alignment evaluation benchmark for large language models (ACL 2024)☆398Updated 11 months ago
- a-m-team's exploration in large language modeling☆173Updated last month
- A live reading list for LLM-synthetic-data.☆308Updated last week
- [ArXiv] PDF-Wukong: A Large Multimodal Model for Efficient Long PDF Reading with End-to-End Sparse Sampling☆121Updated last month
- official code for "Fox: Focus Anywhere for Fine-grained Multi-page Document Understanding"☆151Updated last year
- [CVPR 2025] A Comprehensive Benchmark for Document Parsing and Evaluation☆602Updated last week
- On the Hidden Mystery of OCR in Large Multimodal Models (OCRBench)☆656Updated last week
- InsTag: A Tool for Data Analysis in LLM Supervised Fine-tuning☆263Updated last year
- [ACL'24] Superfiltering: Weak-to-Strong Data Filtering for Fast Instruction-Tuning☆162Updated 3 weeks ago
- ☆48Updated last year
- A knowledge base backend system for LLMs with full-text search, semantic retrieval, and knowledge graph querying. Ready-to-use modules fo…☆28Updated 3 months ago
- Chinese corpus cleaning and quality evaluation for large model pre-training☆67Updated 11 months ago