Convert documents to structured data effortlessly. Unstructured is an open-source ETL solution for transforming complex documents into clean, structured formats for language models. Visit our website to learn more about our enterprise-grade Platform product for production-grade workflows, partitioning, enrichments, chunking and embedding.
☆14,964 · Jun 18, 2026 · Updated this week
Alternatives and similar repositories for unstructured
Users that are interested in unstructured are comparing it to the libraries listed below.
- LlamaIndex is the leading document agent and OCR platform ☆50,188 · Jun 17, 2026 · Updated last week
- DSPy: The framework for programming—not prompting—language models ☆35,310 · Updated this week
- Supercharge Your LLM Application Evaluations 🚀 ☆14,430 · Feb 24, 2026 · Updated 4 months ago
- Open-source AI orchestration framework for building context-engineered, production-ready LLM applications. Design modular pipelines and a… ☆25,622 · Updated this week
- Python SDK, Proxy Server (AI Gateway) to call 100+ LLM APIs in OpenAI (or native) format, with cost tracking, guardrails, loadbalancing a… ☆50,785 · Updated this week
- A modular graph-based Retrieval-Augmented Generation (RAG) system ☆33,885 · Updated this week
- structured outputs for llms ☆13,210 · Updated this week
- The agent engineering platform. ☆139,780 · Updated this week
- A programming framework for agentic AI ☆59,069 · Apr 15, 2026 · Updated 2 months ago
- Universal memory layer for AI Agents ☆59,199 · Updated this week
- Convert PDF to markdown + JSON quickly with high accuracy ☆36,284 · Jun 6, 2026 · Updated 2 weeks ago
- Structured Outputs ☆13,984 · Updated this week
- Get your documents ready for gen AI ☆62,000 · Updated this week
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆83,135 · Jun 17, 2026 · Updated last week
- 💡 All-in-one AI framework for semantic search, LLM orchestration and language model workflows ☆12,673 · Updated this week
- Search infrastructure for AI ☆28,526 · Updated this week
- A guidance language for controlling large language models. ☆21,507 · May 21, 2026 · Updated last month
- Platform for stateful agents: AI with advanced memory that can learn and self-improve over time. ☆23,435 · May 14, 2026 · Updated last month
- Framework for orchestrating role-playing, autonomous AI agents. By fostering collaborative intelligence, CrewAI empowers agents to work t… ☆53,781 · Jun 17, 2026 · Updated last week
- Build, run, and manage agent platforms. ☆40,783 · Updated this week
- Knowledge Agents and Management in the Cloud ☆4,250 · May 18, 2026 · Updated last month
- Build AI Agents, Visually ☆53,860 · Jun 16, 2026 · Updated last week
- 🪢 Open source AI engineering platform: LLM evals, observability, metrics, prompt management, playground, datasets. Integrates with OpenT… ☆29,372 · Updated this week
- Build Conversational AI in minutes ⚡️ ☆12,229 · Jun 11, 2026 · Updated last week
- RAGFlow is a leading open-source Retrieval-Augmented Generation (RAG) engine that fuses cutting-edge RAG with Agent capabilities to creat… ☆83,205 · Updated this week
- OCR, layout analysis, reading order, table recognition in 90+ languages ☆20,840 · Jun 13, 2026 · Updated last week
- ☆934 · May 27, 2026 · Updated 3 weeks ago
- Weaviate is an open-source vector database that stores both objects and vectors, allowing for the combination of vector search with struc… ☆16,348 · Updated this week
- Retrieval Augmented Generation (RAG) chatbot powered by Weaviate ☆7,713 · Jun 8, 2026 · Updated 2 weeks ago
- Semantic cache for LLMs. Fully integrated with LangChain and llama_index. ☆8,073 · Jul 11, 2025 · Updated 11 months ago
- An autonomous agent that conducts deep research on any data using any LLM providers ☆27,799 · May 28, 2026 · Updated 3 weeks ago
- Qdrant - High-performance, massive-scale Vector Database and Vector Search Engine for the next generation of AI. Also available in the cl… ☆32,575 · Updated this week
- Argilla is a collaboration tool for AI engineers and domain experts to build high-quality datasets ☆5,014 · Updated this week
- Unsloth Studio is a web UI for training and running open models like Gemma 4, Qwen3.6, DeepSeek, gpt-oss locally. ☆67,133 · Updated this week
- Retrieval and Retrieval-augmented LLMs ☆11,852 · Apr 22, 2026 · Updated 2 months ago
- An open-source RAG-based tool for chatting with your documents. ☆25,478 · Jun 9, 2026 · Updated 2 weeks ago
- Large Language Model Text Generation Inference ☆10,863 · Mar 21, 2026 · Updated 3 months ago
- Gorilla: Training and Evaluating LLMs for Function Calls (Tool Calls) ☆12,912 · Apr 13, 2026 · Updated 2 months ago
- Milvus is a high-performance, cloud-native vector database built for scalable vector ANN search ☆44,862 · Updated this week