DataEval / dingo
Dingo: A Comprehensive Data Quality Evaluation Tool
☆142Updated this week
Alternatives and similar repositories for dingo:
Users who are interested in dingo are comparing it to the libraries listed below.
- A unified tool to generate fine-tuning datasets for LLMs, including questions, answers, and dialogues. ✨🤖📚💬☆58Updated last month
- ☆323Updated 10 months ago
- OpenSearch-SQL code☆106Updated last week
- ROGRAG: A Robustly Optimized GraphRAG Framework☆123Updated this week
- GraphGen: Enhancing Supervised Fine-Tuning for LLMs with Knowledge-Driven Synthetic Data Generation☆119Updated this week
- Analysis of Chinese and English layouts☆207Updated last month
- Synthesizing High-quality Text-to-SQL Data at Scale. SynSQL-2.5M is the first million-scale cross-domain text-to-SQL dataset.☆240Updated last month
- Converts files into Markdown to help RAG pipelines or LLMs understand them; based on markitdown and MinerU, which provide high-quality PDF parsing.☆92Updated last month
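- Use free LLM APIs together with your private-domain data to generate SFT training data (entirely free of charge); supports the training data formats of tools such as llamafactory. Synthetic data☆159Updated 5 months ago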
- ViDoRAG: Visual Document Retrieval-Augmented Generation via Dynamic Iterative Reasoning Agents☆462Updated last month
- gpt_server is an open-source framework for production-grade deployment of LLMs, Embedding, Reranker, ASR, and TTS.☆173Updated last week
- An easy-to-use framework for modular RAG☆356Updated this week
- 360LayoutAnalysis, a series of Document Analysis Models and Datasets developed by 360 AI Research Institute☆280Updated 8 months ago
- FlexRAG: A RAG Framework for Information Retrieval and Generation.☆163Updated last week
- dify's rag patch module☆224Updated last month
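- unify-easy-llm (ULM) aims to provide a simple, one-click training tool for large models, supporting different hardware such as Nvidia GPU and Ascend NPU as well as commonly used large models.☆55Updated 9 months ago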
- A Model Context Protocol (MCP) server that enables natural language queries to databases☆114Updated 2 weeks ago
- Agentica: Effortlessly Build Intelligent, Reflective, and Collaborative Multimodal AI Agents!☆155Updated this week
- Convert documents (pdf/html/doc/docx/ppt/pptx) to markdown☆42Updated 9 months ago
- Easy, fast, and cheap pretraining, finetuning, and serving for everyone☆295Updated last month
- The Open-Source Data Annotation Platform☆806Updated 2 months ago
- Chat2Graph: Graph Native Agentic System.☆107Updated last week
- ☆58Updated 6 months ago
- ☆86Updated last month
- ☆246Updated 4 months ago
- ☆489Updated 9 months ago
- To try textin document parsing, visit https://cc.co/16YSIy☆92Updated 5 months ago
- This is a user guide for the MiniCPM and MiniCPM-V series of small language models (SLMs) developed by ModelBest. “面壁小钢炮” focuses on achi…☆235Updated 6 months ago
- Automated evaluation service based on Dify + Langfuse☆57Updated 2 weeks ago
- bisheng-unstructured library☆46Updated last week