DataEval / dingo
Dingo: A Comprehensive Data Quality Evaluation Tool
☆36 Updated this week
Alternatives and similar repositories for dingo:
Users who are interested in dingo are comparing it to the libraries listed below:
- [ACL2024 Findings] Agent-FLAN: Designing Data and Methods of Effective Agent Tuning for Large Language Models ☆339 Updated 11 months ago
- MPB (Miner-PDF-Benchmark) is an end-to-end PDF document comprehension evaluation suite designed for large-scale model data scenarios. ☆21 Updated 2 months ago
- OCR Hinders RAG: Evaluating the Cascading Impact of OCR on Retrieval-Augmented Generation ☆65 Updated 3 weeks ago
- Enhance LLM agents with rich tool APIs ☆371 Updated 5 months ago
- Converts documents (pdf/html/doc/docx/ppt/pptx) to Markdown ☆37 Updated 6 months ago
- Baichuan-Omni: Towards Capable Open-source Omni-modal LLM 🌊 ☆263 Updated 3 weeks ago
- [ACM'MM 2024 Oral] Official code for "OneChart: Purify the Chart Structural Extraction via One Auxiliary Token" ☆215 Updated last week
- Repo for Benchmarking Multimodal Retrieval Augmented Generation with Dynamic VQA Dataset and Self-adaptive Planning Agent ☆223 Updated 2 weeks ago
- Official code for "Fox: Focus Anywhere for Fine-grained Multi-page Document Understanding" ☆139 Updated 8 months ago
- Valley is a cutting-edge multimodal large model designed to handle a variety of tasks involving text, images, and video data. ☆214 Updated last week
- ☆27 Updated 6 months ago
- [ACL2024] T-Eval: Evaluating Tool Utilization Capability of Large Language Models Step by Step ☆259 Updated 10 months ago
- InternEvo is an open-source, lightweight training framework that aims to support model pre-training without the need for extensive dependencie… ☆348 Updated this week
- vLLM-accelerated implementation of GOT, combined with MinerU for PDF parsing in RAG ☆46 Updated 3 months ago
- The Open-Source Data Annotation Platform ☆663 Updated this week
- ☆79 Updated 2 months ago
- Official Repository for SIGIR2024 Demo Paper "An Integrated Data Processing Framework for Pretraining Foundation Models" ☆69 Updated 5 months ago
- [ArXiv] PDF-Wukong: A Large Multimodal Model for Efficient Long PDF Reading with End-to-End Sparse Sampling ☆111 Updated 4 months ago
- Evaluating LLMs' multi-round chatting capability via assessing conversations generated by two LLM instances. ☆144 Updated last year
- LLM Group Chat Framework: chat with multiple LLMs at the same time. ☆270 Updated 10 months ago
- ☆168 Updated 2 months ago
- ☆104 Updated last year
- datasets resource ☆102 Updated 6 months ago
- ☆56 Updated last year
- ☆170 Updated 2 weeks ago
- A Toolkit for Running On-device Large Language Models (LLMs) in APP ☆60 Updated 7 months ago
- The newest version of llama3, with source code explained line by line in Chinese ☆22 Updated 10 months ago
- ☆225 Updated 9 months ago
- ☆52 Updated 5 months ago