Convert documents to structured data effortlessly. Unstructured is an open-source ETL solution for transforming complex documents into clean, structured formats for language models. Visit our website to learn more about our enterprise-grade Platform product for production-grade workflows, partitioning, enrichments, chunking, and embedding.
☆14,841 · Jun 6, 2026 · Updated this week
Alternatives and similar repositories for unstructured
Users who are interested in unstructured are comparing it to the libraries listed below.
- LlamaIndex is the leading document agent and OCR platform ☆49,909 · Updated this week
- DSPy: The framework for programming—not prompting—language models ☆34,811 · Updated this week
- Supercharge Your LLM Application Evaluations 🚀 ☆14,252 · Feb 24, 2026 · Updated 3 months ago
- Open-source AI orchestration framework for building context-engineered, production-ready LLM applications. Design modular pipelines and a… ☆25,437 · Updated this week
- Python SDK, Proxy Server (AI Gateway) to call 100+ LLM APIs in OpenAI (or native) format, with cost tracking, guardrails, loadbalancing a… ☆49,384 · Updated this week
- A modular graph-based Retrieval-Augmented Generation (RAG) system ☆33,363 · May 28, 2026 · Updated last week
- structured outputs for llms ☆13,097 · Updated this week
- The agent engineering platform. ☆138,156 · Updated this week
- A programming framework for agentic AI ☆58,726 · Apr 15, 2026 · Updated last month
- Universal memory layer for AI Agents ☆57,641 · Updated this week
- Convert PDF to markdown + JSON quickly with high accuracy ☆35,659 · May 5, 2026 · Updated last month
- Structured Outputs ☆13,920 · May 18, 2026 · Updated 2 weeks ago
- Get your documents ready for gen AI ☆60,897 · Updated this week
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆81,909 · Updated this week
- 💡 All-in-one AI framework for semantic search, LLM orchestration and language model workflows ☆12,622 · Updated this week
- Search infrastructure for AI ☆28,181 · Updated this week
- A guidance language for controlling large language models. ☆21,485 · May 21, 2026 · Updated 2 weeks ago
- Letta is the platform for building stateful agents: AI with advanced memory that can learn and self-improve over time. ☆23,054 · May 14, 2026 · Updated 3 weeks ago
- Framework for orchestrating role-playing, autonomous AI agents. By fostering collaborative intelligence, CrewAI empowers agents to work t… ☆52,820 · Updated this week
- Build, run, and manage agent platforms. ☆40,433 · May 30, 2026 · Updated last week
- Knowledge Agents and Management in the Cloud ☆4,249 · May 18, 2026 · Updated 2 weeks ago
- Build AI Agents, Visually ☆53,245 · May 30, 2026 · Updated last week
- 🪢 Open source LLM engineering platform: LLM Observability, metrics, evals, prompt management, playground, datasets. Integrates with Open… ☆28,569 · Updated this week
- Build Conversational AI in minutes ⚡️ ☆12,157 · May 26, 2026 · Updated last week
- RAGFlow is a leading open-source Retrieval-Augmented Generation (RAG) engine that fuses cutting-edge RAG with Agent capabilities to creat… ☆81,546 · May 29, 2026 · Updated last week
- OCR, layout analysis, reading order, table recognition in 90+ languages ☆20,618 · Updated this week
- ☆933 · May 27, 2026 · Updated last week
- Weaviate is an open-source vector database that stores both objects and vectors, allowing for the combination of vector search with struc… ☆16,280 · Updated this week
- Retrieval Augmented Generation (RAG) chatbot powered by Weaviate ☆7,715 · May 11, 2026 · Updated 3 weeks ago
- Semantic cache for LLMs. Fully integrated with LangChain and llama_index. ☆8,049 · Jul 11, 2025 · Updated 10 months ago
- An autonomous agent that conducts deep research on any data using any LLM providers ☆27,403 · May 28, 2026 · Updated last week
- Qdrant - High-performance, massive-scale Vector Database and Vector Search Engine for the next generation of AI. Also available in the cl… ☆31,788 · Updated this week
- Argilla is a collaboration tool for AI engineers and domain experts to build high-quality datasets ☆4,995 · Updated this week
- Unsloth Studio is a web UI for training and running open models like Gemma 4, Qwen3.6, DeepSeek, gpt-oss locally. ☆65,620 · Updated this week
- Retrieval and Retrieval-augmented LLMs ☆11,770 · Apr 22, 2026 · Updated last month
- An open-source RAG-based tool for chatting with your documents. ☆25,415 · May 31, 2026 · Updated last week
- Large Language Model Text Generation Inference ☆10,857 · Mar 21, 2026 · Updated 2 months ago
- Gorilla: Training and Evaluating LLMs for Function Calls (Tool Calls) ☆12,878 · Apr 13, 2026 · Updated last month
- Milvus is a high-performance, cloud-native vector database built for scalable vector ANN search ☆44,567 · Updated this week