Convert documents to structured data effortlessly. Unstructured is an open-source ETL solution for transforming complex documents into clean, structured formats for language models. Visit our website to learn more about our enterprise-grade Platform product for production-grade workflows, partitioning, enrichments, chunking, and embedding.
☆14,536 · Apr 20, 2026 · Updated this week
Alternatives and similar repositories for unstructured
Users who are interested in unstructured are comparing it to the libraries listed below.
- LlamaIndex is the leading document agent and OCR platform ☆48,790 · Updated this week
- DSPy: The framework for programming—not prompting—language models ☆33,877 · Updated this week
- Supercharge Your LLM Application Evaluations 🚀 ☆13,605 · Feb 24, 2026 · Updated 2 months ago
- Open-source AI orchestration framework for building context-engineered, production-ready LLM applications. Design modular pipelines and a… ☆24,907 · Apr 17, 2026 · Updated last week
- Python SDK, Proxy Server (AI Gateway) to call 100+ LLM APIs in OpenAI (or native) format, with cost tracking, guardrails, loadbalancing a… ☆44,344 · Updated this week
- A modular graph-based Retrieval-Augmented Generation (RAG) system ☆32,334 · Apr 13, 2026 · Updated last week
- structured outputs for llms ☆12,798 · Apr 16, 2026 · Updated last week
- The agent engineering platform ☆133,997 · Updated this week
- Universal memory layer for AI Agents ☆53,665 · Updated this week
- A programming framework for agentic AI ☆57,354 · Apr 15, 2026 · Updated last week
- Convert PDF to markdown + JSON quickly with high accuracy ☆34,060 · Apr 14, 2026 · Updated last week
- Structured Outputs ☆13,694 · Apr 16, 2026 · Updated last week
- Get your documents ready for gen AI ☆58,234 · Updated this week
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆77,531 · Updated this week
- 💡 All-in-one AI framework for semantic search, LLM orchestration and language model workflows ☆12,406 · Apr 14, 2026 · Updated last week
- Data infrastructure for AI ☆27,514 · Updated this week
- A guidance language for controlling large language models. ☆21,397 · Apr 10, 2026 · Updated 2 weeks ago
- Framework for orchestrating role-playing, autonomous AI agents. By fostering collaborative intelligence, CrewAI empowers agents to work t… ☆49,480 · Updated this week
- Letta is the platform for building stateful agents: AI with advanced memory that can learn and self-improve over time. ☆22,141 · Apr 12, 2026 · Updated last week
- Build, run, manage agentic software at scale. ☆39,518 · Apr 17, 2026 · Updated last week
- Knowledge Agents and Management in the Cloud ☆4,248 · Apr 13, 2026 · Updated last week
- 🪢 Open source LLM engineering platform: LLM Observability, metrics, evals, prompt management, playground, datasets. Integrates with Open… ☆25,881 · Updated this week
- Build AI Agents, Visually ☆52,052 · Updated this week
- RAGFlow is a leading open-source Retrieval-Augmented Generation (RAG) engine that fuses cutting-edge RAG with Agent capabilities to creat… ☆78,904 · Updated this week
- Build Conversational AI in minutes ⚡️ ☆11,967 · Apr 9, 2026 · Updated 2 weeks ago
- OCR, layout analysis, reading order, table recognition in 90+ languages ☆19,639 · Apr 10, 2026 · Updated 2 weeks ago
- ☆910 · Apr 13, 2026 · Updated last week
- Weaviate is an open-source vector database that stores both objects and vectors, allowing for the combination of vector search with struc… ☆16,062 · Updated this week
- Retrieval Augmented Generation (RAG) chatbot powered by Weaviate ☆7,652 · Apr 17, 2026 · Updated last week
- Semantic cache for LLMs. Fully integrated with LangChain and llama_index. ☆7,990 · Jul 11, 2025 · Updated 9 months ago
- An autonomous agent that conducts deep research on any data using any LLM providers ☆26,650 · Apr 16, 2026 · Updated last week
- Qdrant - High-performance, massive-scale Vector Database and Vector Search Engine for the next generation of AI. Also available in the cl… ☆30,506 · Updated this week
- Argilla is a collaboration tool for AI engineers and domain experts to build high-quality datasets ☆4,941 · Updated this week
- Web UI for training and running open models like Gemma 4, Qwen3.5, DeepSeek, gpt-oss locally. ☆62,269 · Updated this week
- Retrieval and Retrieval-augmented LLMs ☆11,573 · Apr 1, 2026 · Updated 3 weeks ago
- An open-source RAG-based tool for chatting with your documents. ☆25,288 · Apr 3, 2026 · Updated 3 weeks ago
- Large Language Model Text Generation Inference ☆10,843 · Mar 21, 2026 · Updated last month
- Gorilla: Training and Evaluating LLMs for Function Calls (Tool Calls) ☆12,838 · Apr 13, 2026 · Updated last week
- Milvus is a high-performance, cloud-native vector database built for scalable vector ANN search ☆43,854 · Updated this week