Improved file parsing for LLMs
☆3,154 · Nov 13, 2024 · Updated last year
Alternatives and similar repositories for open-parse
Users that are interested in open-parse are comparing it to the libraries listed below
- UniTable: Towards a Unified Table Foundation Model ☆528 · Jun 4, 2024 · Updated last year
- Convert documents to structured data effortlessly. Unstructured is open-source ETL solution for transforming complex documents into clean… ☆14,214 · Mar 4, 2026 · Updated 2 weeks ago
- Convert PDF to markdown + JSON quickly with high accuracy ☆32,617 · Mar 10, 2026 · Updated last week
- High-performance retrieval engine for unstructured data ☆1,566 · Nov 10, 2025 · Updated 4 months ago
- Structured Outputs ☆13,564 · Mar 9, 2026 · Updated last week
- OCR, layout analysis, reading order, table recognition in 90+ languages ☆19,477 · Mar 1, 2026 · Updated 2 weeks ago
- Developer APIs to Accelerate LLM Projects ☆1,744 · Oct 18, 2024 · Updated last year
- 💡 All-in-one AI framework for semantic search, LLM orchestration and language model workflows ☆12,291 · Updated this week
- This repo provides the server side code for llmsherpa API to connect. It includes parsers for various file formats. ☆1,276 · Mar 28, 2025 · Updated 11 months ago
- Supercharge Your LLM Application Evaluations 🚀 ☆13,008 · Feb 24, 2026 · Updated 3 weeks ago
- Knowledge Agents and Management in the Cloud ☆4,248 · Updated this week
- SoTA production-ready AI retrieval system. Agentic Retrieval-Augmented Generation (RAG) with a RESTful API. ☆7,728 · Nov 7, 2025 · Updated 4 months ago
- DSPy: The framework for programming—not prompting—language models ☆32,853 · Updated this week
- Structured data extraction and instruction calling with ML, LLM and Vision LLM ☆5,133 · Updated this week
- Ingest, parse, and optimize any data format ➡️ from documents to multimedia ➡️ for enhanced compatibility with GenAI frameworks ☆6,809 · Dec 12, 2025 · Updated 3 months ago
- A Repo For Document AI ☆3,147 · Updated this week
- RAG (Retrieval Augmented Generation) Framework for building modular, open source applications for production by TrueFoundry ☆4,329 · Updated this week
- Easily use and train state of the art late-interaction retrieval methods (ColBERT) in any RAG pipeline. Designed for modularity and ease-… ☆3,882 · May 17, 2025 · Updated 10 months ago
- structured outputs for llms ☆12,551 · Updated this week
- LlamaIndex is the leading document agent and OCR platform ☆47,753 · Updated this week
- Convert any URL to an LLM-friendly input with a simple prefix https://r.jina.ai/ ☆10,284 · May 8, 2025 · Updated 10 months ago
- Get your documents ready for gen AI ☆55,944 · Updated this week
- Python SDK, Proxy Server (AI Gateway) to call 100+ LLM APIs in OpenAI (or native) format, with cost tracking, guardrails, loadbalancing a… ☆39,597 · Updated this week
- A modular graph-based Retrieval-Augmented Generation (RAG) system