DataEval / dingo
Dingo: A Comprehensive Data Quality Evaluation Tool
☆97Updated this week
Alternatives and similar repositories for dingo:
Users who are interested in dingo are comparing it to the libraries listed below.
- A unified tool to generate fine-tuning datasets for LLMs, including questions, answers, and dialogues. ✨🤖📚💬☆57Updated 2 weeks ago
- The Open-Source Data Annotation Platform☆750Updated last month
- To try TextIn document parsing, visit https://cc.co/16YSIy☆81Updated 4 months ago
- As the name suggests: a hand-rolled RAG☆121Updated last year
- ☆85Updated 3 weeks ago
- This is a user guide for the MiniCPM and MiniCPM-V series of small language models (SLMs) developed by ModelBest. “面壁小钢炮” focuses on achi…☆223Updated 5 months ago
- ☆314Updated 9 months ago
- dify's rag patch module☆196Updated 2 months ago
- FlexRAG: A RAG Framework for Information Retrieval and Generation.☆136Updated this week
- Convert files into markdown to help RAG or LLMs understand them, based on markitdown and MinerU, which provide high-quality PDF parsing.☆74Updated this week
- In this fast-paced world, we all need a little something to spice up life. Whether you need a glass of sweet talk to lift your spirits or…☆50Updated 2 months ago
- Agentica: Effortlessly Build Intelligent, Reflective, and Collaborative Multimodal AI Agents!☆144Updated 3 weeks ago
- Layout analysis for Chinese and English documents☆185Updated this week
- A Multi-modal RAG Project with Dataset from Honor of Kings, one of the most popular smartphone games in China☆63Updated 7 months ago
- OpenSearch-SQL code☆83Updated 3 weeks ago
- ☆227Updated 3 months ago
- ViDoRAG: Visual Document Retrieval-Augmented Generation via Dynamic Iterative Reasoning Agents☆412Updated last week
- Synthesizing High-quality Text-to-SQL Data at Scale. SynSQL-2.5M is the first million-scale cross-domain text-to-SQL dataset.☆157Updated 2 weeks ago
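- ThinkLLM: implementations of large language model algorithms and components☆26Updated last week
- RAG-QA-Generator is an automated knowledge-base construction and management tool for retrieval-augmented generation (RAG) systems. It reads document data, uses a large language model to generate high-quality question-answer (QA) pairs, and inserts them into a database, automating the construction and management of a RAG system's knowledge base.☆140Updated 3 months ago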
- ☆55Updated this week
- 360LayoutAnalysis, a series of document analysis models and datasets developed by 360 AI Research Institute☆271Updated 6 months ago
- gpt_server is an open-source framework for production-grade deployment of LLMs or embedding models.☆163Updated this week
- ✨🦋 illufly is a self-evolving Agent framework: create value quickly through self-evolution☆60Updated this week
- Convert documents (pdf/html/doc/docx/ppt/pptx) to markdown☆38Updated 8 months ago
- An easy-to-use framework for modular RAG☆341Updated this week
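- An application example of GraphRAG. The project provides a way to replace the OpenAI models and improves GraphRAG's handling of Chinese content by modifying the original prompts and the document-splitting method.☆131Updated 4 months ago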
- HuixiangDou2: A Robustly Optimized GraphRAG Approach☆96Updated this week
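- Use free LLM APIs together with your private-domain data to generate SFT training data (entirely free of charge). Supports the training data formats of tools such as llamafactory. Synthetic data.☆154Updated 4 months ago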
- MPB (Miner-PDF-Benchmark) is an end-to-end PDF document comprehension evaluation suite designed for large-scale model data scenarios.☆21Updated 3 months ago