hwb96 / markdown-structure-splitter
A Markdown document tool for RAG (Retrieval-Augmented Generation) systems with two main features: automatic extraction of the heading structure and document splitting. It preserves the full document hierarchy, addressing the tendency of conventional splitters to lose heading levels and break tables apart.
☆12Updated last year
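To make the description above concrete, here is a minimal sketch of hierarchy-preserving splitting. It is not taken from the repository; `Chunk` and `split_by_headings` are hypothetical names, and the sketch assumes ATX-style (`#`) headings only. It splits only at heading lines, so a table is never cut in the middle, and it records the full heading path for each chunk.

```python
import re
from dataclasses import dataclass

# ATX headings only ("# Title" .. "###### Title"); Setext headings are not handled.
HEADING = re.compile(r"^(#{1,6})\s+(.*?)\s*$")


@dataclass
class Chunk:
    path: list[str]  # heading titles from the top level down to this section
    text: str        # section body, including any tables, left intact


def split_by_headings(markdown: str) -> list[Chunk]:
    """Hypothetical helper: split a Markdown string into heading-scoped chunks."""
    chunks: list[Chunk] = []
    stack: list[tuple[int, str]] = []  # open headings as (level, title)
    body: list[str] = []
    in_fence = False

    def flush() -> None:
        # Emit the body collected so far under the current heading path.
        text = "\n".join(body).strip()
        if text:
            chunks.append(Chunk([title for _, title in stack], text))
        body.clear()

    for line in markdown.splitlines():
        if line.lstrip().startswith("```"):
            in_fence = not in_fence  # ignore '#' lines inside fenced code
        match = None if in_fence else HEADING.match(line)
        if match:
            flush()
            level = len(match.group(1))
            # Close headings at the same or a deeper level before opening this one.
            while stack and stack[-1][0] >= level:
                stack.pop()
            stack.append((level, match.group(2)))
        else:
            body.append(line)
    flush()
    return chunks


if __name__ == "__main__":
    sample = "# Guide\nintro\n## Data\n| a | b |\n|---|---|\n| 1 | 2 |\n"
    for chunk in split_by_headings(sample):
        print(" > ".join(chunk.path), "::", repr(chunk.text))
```

Because each chunk carries its heading path, a retriever can prepend that path to the chunk text before embedding, so a table under "Data" is still indexed with the context of its parent sections.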
Alternatives and similar repositories for markdown-structure-splitter
Users who are interested in markdown-structure-splitter are comparing it to the libraries listed below
- Automated evaluation service built on Dify + Langfuse☆88Updated 8 months ago
- Converts files to Markdown so that RAG pipelines and LLMs can consume them; built on markitdown and MinerU for high-quality PDF parsing.☆132Updated 10 months ago
- Agentica: Effortlessly Build Intelligent, Reflective, and Collaborative Multimodal AI Agents!☆244Updated this week
- Intelligent data apps and assets with LLMs☆186Updated 11 months ago
- ☆46Updated 9 months ago
- Engineering experiment repository of 筱可 (Xiaoke)!☆109Updated 3 months ago
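- In RAG, generating and matching embedding vectors is a key step. This article presents an embedding service based on CLIP/BLIP models that supports embedding generation and similarity computation for both text and images, providing a foundation for multimodal information retrieval.☆42Updated last year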
- MinerU API server☆85Updated last year
- A collection of RAG systems powered by LLM.☆216Updated 11 months ago
- XRAG: eXamining the Core - Benchmarking Foundational Component Modules in Advanced Retrieval-Augmented Generation☆116Updated this week
- gpt_server is an open-source framework for production-grade deployment of LLM, Embedding, Reranker, ASR, TTS, text-to-image, image-editing, and text-to-video models.☆244Updated last week
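- PDF parsing (text, sections, tables, images, references); PDF question answering, summarization, and information extraction based on large models (ChatGLM2-6B, RWKV) + langchain + streamlit☆213Updated 2 years ago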
- TianGong-AI-Unstructure☆69Updated this week
- MCP Agent Graph is a Multi-Agent System built on the principles of Context Engineering☆187Updated last month
- [EACL'26] DeepSieve: Information Sieving via LLM-as-a-Knowledge-Router☆105Updated last month
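- This project presents prompt-engineering use cases, including a simulated intelligent recommendation customer-service system and advanced demos covering question answering, chain of thought, self-consistency, and tree of thoughts, to help readers understand prompts. A single codebase supports multiple large models (such as OpenAI and Alibaba Tongyi Qianwen) and wraps the application as an API with FastAPI.☆51Updated last year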
- OpenSearch-SQL code☆166Updated 8 months ago
- Compares LightRAG and GraphRAG in index building and retrieval tests on elapsed time, number of model requests, token cost, and retrieval quality☆159Updated last year
- A method and corresponding code for automatic description generation for Text-to-SQL☆107Updated 5 months ago
- PDF-to-Markdown conversion using paddle ocr☆79Updated last year
- General information extraction (entity/relation/event extraction) based on the Qwen2 model☆40Updated last year
- bisheng-unstructured library☆57Updated 8 months ago
- dify's rag patch module☆277Updated 5 months ago
- DSPy Chinese documentation☆48Updated last year
- Exploring the potential of LLM applications in the legal industry☆96Updated last year
- SDK for Dify plugins☆123Updated this week
- E2M API, converting everything to markdown (LLM-friendly Format).☆139Updated last year
- A unified tool to generate fine-tuning datasets for LLMs, including questions, answers, and dialogues. ✨🤖📚💬☆62Updated 10 months ago
- [ACL2025 demo track] ROGRAG: A Robustly Optimized GraphRAG Framework☆194Updated last month
- ☆62Updated 11 months ago