ccprocessor / llm-webkit-mirror

☆23

Alternatives and similar repositories for llm-webkit-mirror

Users who are interested in llm-webkit-mirror are comparing it to the libraries listed below.

opendatalab / Miner-PDF-Benchmark
MPB (Miner-PDF-Benchmark) is an end-to-end PDF document comprehension evaluation suite designed for large-scale model data scenarios.
☆23 Updated 11 months ago
opendatalab / mineru-vl-utils
A Python package for interacting with the MinerU Vision-Language Model.
☆69 Updated last week
LingyvKong / OneChart
[ACM'MM 2024 Oral] Official code for "OneChart: Purify the Chart Structural Extraction via One Auxiliary Token"
☆254 Updated 7 months ago
liunian-Jay / MU-GOT
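PDF parsing tool: a vLLM-accelerated implementation of GOT, with MinerU handling layout detection and cropping and GOT handling table and formula parsing, for PDF parsing in RAG
☆66 Updated last year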
SpursGoZmy / Table-LLaVA
Dataset and Code for our ACL 2024 paper: "Multimodal Table Understanding". We propose the first large-scale Multimodal IFT and Pre-Train …
☆220 Updated 5 months ago
ucaslcl / Fox
Official code for "Fox: Focus Anywhere for Fine-grained Multi-page Document Understanding"
☆184 Updated last year
GAIR-NLP / ProX
[ICML 2025] Programming Every Example: Lifting Pre-training Data Quality Like Experts at Scale
☆263 Updated 4 months ago
CASIA-LM / ChineseWebText
☆180 Updated 2 years ago
microsoft / RedStone
The RedStone repository includes code for preparing extensive datasets used in training large language models.
☆145 Updated 4 months ago
PKU-Baichuan-MLSystemLab / PAS
☆54 Updated last year
FlagOpen / Infinity-Instruct
☆49 Updated last year
a-m-team / a-m-models
a-m-team's exploration in large language modeling
☆192 Updated 5 months ago
harrytea / Awesome-Document-Understanding
Document Artificial Intelligence
☆192 Updated last month
infly-ai / INF-MLLM
☆103 Updated this week
LukeForeverYoung / UReader
☆142 Updated last year
Ucas-HaoranWei / Vary-tiny-600k
Vary-tiny codebase built on LAVIS (for training from scratch) and a PDF image-text pair dataset (about 600k, including English/Chinese)
☆86 Updated last year
Alpha-Innovator / StructEqTable-Deploy
A High-efficiency Open-source Toolkit for Table-to-Latex Task
☆267 Updated 11 months ago
opendatalab / opendatalab-datasets
datasets resource
☆125 Updated 4 months ago
InternLM / Agent-FLAN
[ACL2024 Findings] Agent-FLAN: Designing Data and Methods of Effective Agent Tuning for Large Language Models
☆355 Updated last year
SuperGPQA / SuperGPQA
☆172 Updated 6 months ago
opendatalab / WanJuan1.0
WanJuan 1.0 multimodal corpus
☆567 Updated 2 years ago
tianyi-lab / Superfiltering
[ACL'24] Superfiltering: Weak-to-Strong Data Filtering for Fast Instruction-Tuning
☆182 Updated 4 months ago
modelscope / easydistill
a toolkit on knowledge distillation for large language models
☆200 Updated 2 weeks ago
open-compass / T-Eval
[ACL2024] T-Eval: Evaluating Tool Utilization Capability of Large Language Models Step by Step
☆297 Updated last year
LaVi-Lab / CLEVA
[EMNLP 2023 Demo] "CLEVA: Chinese Language Models EVAluation Platform"
☆62 Updated 6 months ago
westlake-baichuan-mllm / bc-omni
Baichuan-Omni: Towards Capable Open-source Omni-modal LLM 🌊
☆269 Updated 9 months ago
MigoXLab / dingo
Dingo: A Comprehensive AI Data Quality Evaluation Tool
☆564 Updated this week
Alibaba-NLP / OmniSearch
Repo for Benchmarking Multimodal Retrieval Augmented Generation with Dynamic VQA Dataset and Self-adaptive Planning Agent
☆393 Updated 7 months ago
thu-coai / CritiqueLLM
☆147 Updated last year
QwenLM / AutoIF
☆314 Updated last year