Convert documents to structured data effortlessly. Unstructured is an open-source ETL solution for transforming complex documents into clean, structured formats for language models. Visit our website to learn more about our enterprise-grade Platform product for production-grade workflows, partitioning, enrichments, chunking, and embedding.
☆14,700 · May 13, 2026 · Updated this week
Alternatives and similar repositories for unstructured
Users who are interested in unstructured are comparing it to the libraries listed below.
- LlamaIndex is the leading document agent and OCR platform ☆49,354 · Updated this week
- DSPy: The framework for programming—not prompting—language models ☆34,327 · May 7, 2026 · Updated last week
- Supercharge Your LLM Application Evaluations 🚀 ☆13,896 · Feb 24, 2026 · Updated 2 months ago
- Open-source AI orchestration framework for building context-engineered, production-ready LLM applications. Design modular pipelines and a… ☆25,140 · May 8, 2026 · Updated last week
- Python SDK, Proxy Server (AI Gateway) to call 100+ LLM APIs in OpenAI (or native) format, with cost tracking, guardrails, loadbalancing a… ☆46,789 · Updated this week
- A modular graph-based Retrieval-Augmented Generation (RAG) system ☆32,860 · Updated this week
- structured outputs for llms ☆12,945 · Updated this week
- The agent engineering platform. Available in TypeScript! ☆136,191 · Updated this week
- Universal memory layer for AI Agents ☆55,385 · Updated this week
- A programming framework for agentic AI ☆58,015 · Apr 15, 2026 · Updated last month
- Convert PDF to markdown + JSON quickly with high accuracy ☆34,893 · May 5, 2026 · Updated last week
- Structured Outputs ☆13,825 · May 4, 2026 · Updated last week
- Get your documents ready for gen AI ☆59,522 · Updated this week
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆79,733 · Updated this week
- 💡 All-in-one AI framework for semantic search, LLM orchestration and language model workflows ☆12,478 · Updated this week
- Search infrastructure for AI ☆27,892 · Updated this week
- A guidance language for controlling large language models. ☆21,448 · May 6, 2026 · Updated last week
- Framework for orchestrating role-playing, autonomous AI agents. By fostering collaborative intelligence, CrewAI empowers agents to work t… ☆51,228 · Updated this week
- Letta is the platform for building stateful agents: AI with advanced memory that can learn and self-improve over time. ☆22,557 · Apr 12, 2026 · Updated last month
- Build, run, and manage agent platforms. ☆40,013 · Updated this week
- Knowledge Agents and Management in the Cloud ☆4,251 · May 4, 2026 · Updated last week
- Build AI Agents, Visually ☆52,673 · Updated this week
- 🪢 Open source LLM engineering platform: LLM Observability, metrics, evals, prompt management, playground, datasets. Integrates with Open… ☆27,172 · Updated this week
- Build Conversational AI in minutes ⚡️ ☆12,061 · Apr 24, 2026 · Updated 3 weeks ago
- RAGFlow is a leading open-source Retrieval-Augmented Generation (RAG) engine that fuses cutting-edge RAG with Agent capabilities to creat… ☆79,944 · May 8, 2026 · Updated last week
- OCR, layout analysis, reading order, table recognition in 90+ languages ☆19,730 · May 6, 2026 · Updated last week
- ☆920 · Updated this week
- Weaviate is an open-source vector database that stores both objects and vectors, allowing for the combination of vector search with struc… ☆16,177 · Updated this week
- Retrieval Augmented Generation (RAG) chatbot powered by Weaviate ☆7,690 · May 4, 2026 · Updated last week
- Semantic cache for LLMs. Fully integrated with LangChain and llama_index. ☆8,021 · Jul 11, 2025 · Updated 10 months ago
- An autonomous agent that conducts deep research on any data using any LLM providers ☆26,934 · Apr 16, 2026 · Updated 3 weeks ago
- Qdrant - High-performance, massive-scale Vector Database and Vector Search Engine for the next generation of AI. Also available in the cl… ☆31,239 · Updated this week
- Argilla is a collaboration tool for AI engineers and domain experts to build high-quality datasets ☆4,968 · Apr 27, 2026 · Updated 2 weeks ago
- Unsloth Studio is a web UI for training and running open models like Gemma 4, Qwen3.6, DeepSeek, gpt-oss locally. ☆63,952 · Updated this week
- Retrieval and Retrieval-augmented LLMs ☆11,658 · Apr 22, 2026 · Updated 3 weeks ago
- An open-source RAG-based tool for chatting with your documents. ☆25,367 · Apr 3, 2026 · Updated last month
- Large Language Model Text Generation Inference ☆10,854 · Mar 21, 2026 · Updated last month
- Gorilla: Training and Evaluating LLMs for Function Calls (Tool Calls) ☆12,861 · Apr 13, 2026 · Updated last month
- Milvus is a high-performance, cloud-native vector database built for scalable vector ANN search ☆44,195 · Updated this week