Convert documents to structured data effortlessly. Unstructured is an open-source ETL solution for transforming complex documents into clean, structured formats for language models. Visit our website to learn more about our enterprise-grade Platform product for production-grade workflows, partitioning, enrichments, chunking, and embedding.
☆14,074 · Updated this week
Alternatives and similar repositories for unstructured
Users who are interested in unstructured are comparing it to the libraries listed below.
- LlamaIndex is the leading document agent and OCR platform ☆47,210 · Updated this week
- DSPy: The framework for programming—not prompting—language models ☆32,381 · Updated this week
- Open-source AI orchestration framework for building context-engineered, production-ready LLM applications. Design modular pipelines and a… ☆24,295 · Updated this week
- Python SDK, Proxy Server (AI Gateway) to call 100+ LLM APIs in OpenAI (or native) format, with cost tracking, guardrails, loadbalancing a… ☆37,083 · Updated this week
- Supercharge Your LLM Application Evaluations 🚀 ☆12,736 · Updated this week
- structured outputs for llms ☆12,428 · Updated this week
- A modular graph-based Retrieval-Augmented Generation (RAG) system ☆31,031 · Feb 20, 2026 · Updated last week
- Universal memory layer for AI Agents ☆47,994 · Updated this week
- 🦜🔗 The platform for reliable agents. ☆127,192 · Updated this week
- A programming framework for agentic AI ☆54,956 · Jan 22, 2026 · Updated last month
- Structured Outputs ☆13,456 · Feb 13, 2026 · Updated 2 weeks ago
- Convert PDF to markdown + JSON quickly with high accuracy ☆31,857 · Feb 9, 2026 · Updated 2 weeks ago
- Get your documents ready for gen AI ☆54,094 · Updated this week
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆71,234 · Updated this week
- 💡 All-in-one AI framework for semantic search, LLM orchestration and language model workflows ☆12,210 · Updated this week
- Letta is the platform for building stateful agents: AI with advanced memory that can learn and self-improve over time. ☆21,214 · Jan 29, 2026 · Updated last month
- Framework for orchestrating role-playing, autonomous AI agents. By fostering collaborative intelligence, CrewAI empowers agents to work t… ☆44,662 · Updated this week
- The programming language for agentic software. Build, run, and manage multi-agent systems at scale. ☆38,104 · Updated this week
- Open-source search and retrieval database for AI applications. ☆26,269 · Updated this week
- 🪢 Open source LLM engineering platform: LLM Observability, metrics, evals, prompt management, playground, datasets. Integrates with Open… ☆22,415 · Updated this week
- A guidance language for controlling large language models. ☆21,319 · Feb 13, 2026 · Updated 2 weeks ago
- Build AI Agents, Visually ☆49,310 · Updated this week
- RAGFlow is a leading open-source Retrieval-Augmented Generation (RAG) engine that fuses cutting-edge RAG with Agent capabilities to creat… ☆73,900 · Updated this week
- Build Conversational AI in minutes ⚡️ ☆11,613 · Feb 18, 2026 · Updated last week
- Knowledge Agents and Management in the Cloud ☆4,235 · Feb 17, 2026 · Updated last week
- Weaviate is an open-source vector database that stores both objects and vectors, allowing for the combination of vector search with struc… ☆15,690 · Updated this week
- OCR, layout analysis, reading order, table recognition in 90+ languages ☆19,360 · Updated this week
- Qdrant - High-performance, massive-scale Vector Database and Vector Search Engine for the next generation of AI. Also available in the cl… ☆29,102 · Updated this week
- An autonomous agent that conducts deep research on any data using any LLM providers. ☆25,376 · Feb 21, 2026 · Updated last week
- Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train OpenAI gpt-oss, DeepSeek, Qwen, Llama, Gemma, TTS 2x faster with 70% less VRAM. ☆52,724 · Updated this week
- Retrieval Augmented Generation (RAG) chatbot powered by Weaviate ☆7,574 · Jul 14, 2025 · Updated 7 months ago
- Semantic cache for LLMs. Fully integrated with LangChain and llama_index. ☆7,943 · Jul 11, 2025 · Updated 7 months ago
- Argilla is a collaboration tool for AI engineers and domain experts to build high-quality datasets ☆4,875 · Updated this week
- Langflow is a powerful tool for building and deploying AI-powered agents and workflows. ☆145,034 · Updated this week
- An open-source RAG-based tool for chatting with your documents. ☆25,152 · Jul 4, 2025 · Updated 7 months ago
- 🔥 The Web Data API for AI - Turn entire websites into LLM-ready markdown or structured data ☆84,899 · Feb 22, 2026 · Updated last week
- Gorilla: Training and Evaluating LLMs for Function Calls (Tool Calls) ☆12,731 · Feb 9, 2026 · Updated 2 weeks ago
- Large Language Model Text Generation Inference ☆10,774 · Jan 8, 2026 · Updated last month
- Retrieval and Retrieval-augmented LLMs ☆11,329 · Dec 15, 2025 · Updated 2 months ago