Convert documents to structured data effortlessly. Unstructured is an open-source ETL solution for transforming complex documents into clean, structured formats for language models. Visit our website to learn more about our enterprise-grade Platform product for production-grade workflows, partitioning, enrichments, chunking, and embedding.
☆14,808 · May 23, 2026 · Updated last week
Alternatives and similar repositories for unstructured
Users who are interested in unstructured are comparing it to the libraries listed below.
- LlamaIndex is the leading document agent and OCR platform ☆49,695 · May 26, 2026 · Updated last week
- DSPy: The framework for programming—not prompting—language models ☆34,811 · Updated this week
- Supercharge Your LLM Application Evaluations 🚀 ☆14,123 · Feb 24, 2026 · Updated 3 months ago
- Open-source AI orchestration framework for building context-engineered, production-ready LLM applications. Design modular pipelines and a… ☆25,437 · Updated this week
- Python SDK, Proxy Server (AI Gateway) to call 100+ LLM APIs in OpenAI (or native) format, with cost tracking, guardrails, loadbalancing a… ☆48,644 · Updated this week
- A modular graph-based Retrieval-Augmented Generation (RAG) system ☆33,363 · May 28, 2026 · Updated last week
- structured outputs for llms ☆13,023 · May 24, 2026 · Updated last week
- The agent engineering platform. ☆138,156 · Updated this week
- A programming framework for agentic AI ☆58,533 · Apr 15, 2026 · Updated last month
- Universal memory layer for AI Agents ☆56,739 · May 26, 2026 · Updated last week
- Convert PDF to markdown + JSON quickly with high accuracy ☆35,659 · May 5, 2026 · Updated 3 weeks ago
- Structured Outputs ☆13,920 · May 18, 2026 · Updated 2 weeks ago
- Get your documents ready for gen AI ☆60,897 · Updated this week
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆81,099 · May 27, 2026 · Updated last week
- 💡 All-in-one AI framework for semantic search, LLM orchestration and language model workflows ☆12,622 · Updated this week
- Search infrastructure for AI ☆28,181 · Updated this week
- A guidance language for controlling large language models. ☆21,485 · May 21, 2026 · Updated 2 weeks ago
- Letta is the platform for building stateful agents: AI with advanced memory that can learn and self-improve over time. ☆23,054 · May 14, 2026 · Updated 3 weeks ago
- Framework for orchestrating role-playing, autonomous AI agents. By fostering collaborative intelligence, CrewAI empowers agents to work t… ☆52,280 · May 27, 2026 · Updated last week
- Build, run, and manage agent platforms. ☆40,433 · Updated this week
- Knowledge Agents and Management in the Cloud ☆4,251 · May 18, 2026 · Updated 2 weeks ago
- Build AI Agents, Visually ☆53,245 · Updated this week
- 🪢 Open source LLM engineering platform: LLM Observability, metrics, evals, prompt management, playground, datasets. Integrates with Open… ☆28,205 · Updated this week
- Build Conversational AI in minutes ⚡️ ☆12,157 · May 26, 2026 · Updated last week
- RAGFlow is a leading open-source Retrieval-Augmented Generation (RAG) engine that fuses cutting-edge RAG with Agent capabilities to creat… ☆81,546 · Updated this week
- OCR, layout analysis, reading order, table recognition in 90+ languages ☆19,787 · May 27, 2026 · Updated last week
- ☆931 · May 27, 2026 · Updated last week
- Weaviate is an open-source vector database that stores both objects and vectors, allowing for the combination of vector search with struc… ☆16,257 · Updated this week
- Retrieval Augmented Generation (RAG) chatbot powered by Weaviate ☆7,715 · May 11, 2026 · Updated 3 weeks ago
- Semantic cache for LLMs. Fully integrated with LangChain and llama_index. ☆8,049 · Jul 11, 2025 · Updated 10 months ago
- An autonomous agent that conducts deep research on any data using any LLM providers ☆27,403 · May 28, 2026 · Updated last week
- Qdrant - High-performance, massive-scale Vector Database and Vector Search Engine for the next generation of AI. Also available in the cl… ☆31,578 · May 26, 2026 · Updated last week
- Argilla is a collaboration tool for AI engineers and domain experts to build high-quality datasets ☆4,985 · May 25, 2026 · Updated last week
- Unsloth Studio is a web UI for training and running open models like Gemma 4, Qwen3.6, DeepSeek, gpt-oss locally. ☆65,620 · Updated this week
- Retrieval and Retrieval-augmented LLMs ☆11,722 · Apr 22, 2026 · Updated last month
- An open-source RAG-based tool for chatting with your documents. ☆25,415 · Updated this week
- Large Language Model Text Generation Inference ☆10,857 · Mar 21, 2026 · Updated 2 months ago
- Gorilla: Training and Evaluating LLMs for Function Calls (Tool Calls) ☆12,878 · Apr 13, 2026 · Updated last month
- Milvus is a high-performance, cloud-native vector database built for scalable vector ANN search ☆44,567 · Updated this week