kreuzberg-dev / kreuzberg
A polyglot document intelligence framework with a Rust core. Extract text, metadata, and structured information from PDFs, Office documents, images, and 75+ formats. Available for Rust, Python, Ruby, Java, Go, PHP, Elixir, C#, and TypeScript (Node/Bun/Wasm/Deno), or use via CLI, REST API, or MCP server.
⭐5,943 · Updated this week
Alternatives and similar repositories for kreuzberg
Users who are interested in kreuzberg are comparing it to the libraries listed below.
- Get your documents ready for gen AI · ⭐52,799 · Updated this week
- A Python tool to visualize + enforce dependencies, using modular architecture. Open source. Installable via pip. Able to be adopted… · ⭐2,640 · Updated this week
- PgQueuer is a Python library leveraging PostgreSQL for efficient job queuing. · ⭐1,437 · Dec 25, 2025 · Updated last month
- A reactive notebook for Python: run reproducible experiments, query with SQL, execute as a script, deploy as an app, and version with gi… · ⭐19,005 · Updated this week
- Modern datetime library for Python · ⭐2,298 · Feb 6, 2026 · Updated last week
- A self-hosted API that takes a URL and returns a file with browser screenshots. · ⭐1,105 · Mar 9, 2025 · Updated 11 months ago
- Python tool for converting files and office documents to Markdown. · ⭐86,605 · Jan 8, 2026 · Updated last month
- ⭐891 · May 13, 2025 · Updated 9 months ago
- All-in-one AI framework for semantic search, LLM orchestration and language model workflows · ⭐12,130 · Updated this week
- WebApps in pure Python. No JavaScript, HTML and CSS needed · ⭐3,349 · Feb 6, 2026 · Updated last week
- Toolkit for linearizing PDFs for LLM datasets/training · ⭐16,890 · Updated this week
- OCR & Document Extraction using vision models · ⭐12,136 · May 20, 2025 · Updated 8 months ago
- A tool for Python developers to easily debug the HTTP(S) client and server requests in a Python program. · ⭐892 · Nov 23, 2025 · Updated 2 months ago
- Run Background Tasks at Scale · ⭐6,521 · Updated this week
- An intuitive spreadsheet-like interface that lets users of all technical skill levels view, edit, query, and collaborate on Postgres data… · ⭐4,828 · Updated this week
- Concurrent Python made simple · ⭐1,519 · Feb 4, 2025 · Updated last year
- The most accurate document search and store for building AI apps · ⭐3,471 · Feb 9, 2026 · Updated last week
- Convert PDF to markdown + JSON quickly with high accuracy · ⭐31,582 · Updated this week
- Open-source developer platform to power your entire infra and turn scripts into webhooks, workflows and UIs. Fastest workflow engine (13x… · ⭐15,737 · Feb 9, 2026 · Updated last week
- OCR, layout analysis, reading order, table recognition in 90+ languages · ⭐19,263 · Feb 4, 2026 · Updated last week
- Deep inspection of Python objects · ⭐1,934 · Jan 24, 2026 · Updated 3 weeks ago
- FastOpenAPI is a library for generating and integrating OpenAPI schemas using Pydantic v2 and various frameworks (AioHttp, Django, Falcon… · ⭐495 · Updated this week
- The SOTA Open-Source Browser Agent for autonomously performing complex tasks on the web · ⭐2,334 · Jun 9, 2025 · Updated 8 months ago
- An open-source RAG-based tool for chatting with your documents. · ⭐25,019 · Jul 4, 2025 · Updated 7 months ago
- GenAI Agent Framework, the Pydantic way · ⭐14,875 · Updated this week
- Lightpanda: the headless browser designed for AI and automation · ⭐11,824 · Updated this week
- Python SDK, Proxy Server (AI Gateway) to call 100+ LLM APIs in OpenAI (or native) format, with cost tracking, guardrails, loadbalancing a… · ⭐35,968 · Updated this week
- OpenSource Production ready Customer service with built in Evals and monitoring · ⭐1,435 · Jan 12, 2026 · Updated last month
- Create web-based user interfaces with Python. The nice way. · ⭐15,343 · Updated this week
- Pydoll is a library for automating chromium-based browsers without a WebDriver, offering realistic interactions. · ⭐6,525 · Feb 8, 2026 · Updated last week
- A web framework for building products with Python. · ⭐652 · Updated this week
- Summarize and query from a lot of heterogeneous documents. Any LLM provider, any filetype, advanced RAG, advanced summaries, scriptable, … · ⭐508 · Jan 20, 2026 · Updated 3 weeks ago
- Crawlee: A web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Dow… · ⭐8,047 · Updated this week
- Build multi-agent systems that learn and improve with every interaction. · ⭐37,691 · Updated this week
- Structured Outputs · ⭐13,403 · Feb 6, 2026 · Updated last week
- Lightweight Durable Python Workflows · ⭐1,179 · Updated this week
- SoTA production-ready AI retrieval system. Agentic Retrieval-Augmented Generation (RAG) with a RESTful API. · ⭐7,673 · Nov 7, 2025 · Updated 3 months ago
- Web apps in pure Python · ⭐28,095 · Updated this week
- Open-source infrastructure for Computer-Use Agents. Sandboxes, SDKs, and benchmarks to train and evaluate AI agents that can control full… · ⭐12,533 · Updated this week