adoresever / Pretuning
A tool for creating pre-training datasets for language models, supporting one-click batch processing for both text and image datasets.
☆32Updated 4 months ago
Alternatives and similar repositories for Pretuning
Users that are interested in Pretuning are comparing it to the libraries listed below
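- A comparison of LightRAG and GraphRAG in index construction and retrieval testing, covering elapsed time, number of model requests, token cost, retrieval quality, and other aspects☆87Updated 5 months ago
- A merged version of GraphRAG-Ollama-UI and GraphRAG4OpenWebUI (with a gradio web UI for configuring and generating RAG indexes, and fastapi providing a RAG API service)☆107Updated 8 months ago
- Adds a 🚀 streaming web service to GraphRAG, compatible with the OpenAI SDK, with support for accessible entity links 🔗 and suggested questions, compatible with local embedding models, and fixing many issues.☆251Updated last month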
- Scenario-based large model testing toolbox☆24Updated 9 months ago
- A unified tool to generate fine-tuning datasets for LLMs, including questions, answers, and dialogues. ✨🤖📚💬☆59Updated last month
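- Use free LLM APIs together with your private-domain data to generate SFT training data (entirely free of charge); supports the training data formats of tools such as llamafactory. Synthetic data☆159Updated 5 months ago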
- A modular graph-based Retrieval-Augmented Generation (RAG) system☆57Updated 3 months ago
- ✨🦋 illufly (幻蝶): a self-evolving agent based on memory distillation and document retrieval☆60Updated last week
- Generate PPT with LLM☆91Updated last year
- Official code for Dynamic Parametric RAG.☆107Updated last week
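- A tool for converting text corpora into training sets (txt to dataset)☆92Updated last year
- microsoft/graphrag with Chinese-language support 🇨🇳🇨🇳🇨🇳☆44Updated last month
- This project presents use cases for prompt engineering, including building a simulated intelligent recommendation customer-service system and advanced demos such as question answering, chain of thought, self-consistency, and tree of thoughts, aiming to help readers understand prompts. A single codebase supports multiple large models (such as OpenAI and Alibaba Tongyi Qianwen) and uses FastAPI to wrap the application as an API.☆30Updated 7 months ago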
- ☆94Updated last month
- Chat2Graph: Graph Native Agentic System.☆128Updated last week
- Convert files into Markdown to help RAG systems or LLMs understand them, based on markitdown and MinerU, which provides a high-quality PDF parser.☆92Updated last month
- llamafactory blog☆25Updated 6 months ago
- MinerU API server☆53Updated 4 months ago
- ☆246Updated 4 months ago
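- An LLM RAG system that runs on your laptop. It is easy to deploy on a laptop and provides intelligent question answering over a local knowledge base.☆228Updated 3 months ago
- Address extraction for regions using multiple agents☆27Updated last month
- A GUI version of GOT-OCR that provides OCR, PDF export, batch processing, and other features, but no training functionality☆169Updated this week
- An application example of GraphRAG. It provides a way to replace the OpenAI models and improves GraphRAG's handling of Chinese content by modifying the original prompts and the document-splitting method.☆151Updated 6 months ago
- gpt_server is an open-source framework for production-grade deployment of LLMs, Embedding, Reranker, ASR, and TTS.☆177Updated this week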
- Neo4j graph construction from unstructured data☆304Updated 9 months ago
- A code executor for Dify that is compatible with the official sandbox API calls and dependency installation.☆249Updated 2 weeks ago
- ☆109Updated 9 months ago
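- This project aims to provide a complete closed-loop case of fine-tuning and applying a vertical-domain large model for hotel recommendation, as a reference for others. The base model is Qwen2.5-7B-Instruct. Highlights: a complete closed-loop vertical application case, open-sourced source-code analysis, a detailed illustrated guide, and a hands-on video demonstrating the full workflow☆35Updated 2 weeks ago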
- GraphGen: Enhancing Supervised Fine-Tuning for LLMs with Knowledge-Driven Synthetic Data Generation☆135Updated this week
- An open version of Manus.☆58Updated last month