drmingler / docling-api
An easily deployable, scalable backend server that efficiently converts various document formats (PDF, DOCX, PPTX, HTML, images, etc.) into Markdown. It supports both CPU and GPU processing, which makes it well suited to large-scale workflows, and it offers text/table extraction, OCR, and batch processing with sync/async endpoints.
☆622 Updated 3 months ago
Alternatives and similar repositories for docling-api
Users who are interested in docling-api are comparing it to the libraries listed below.
- Running Docling as an API service ☆479 Updated this week
- An experiment in meeting transcription and diarization with just an LLM. Maybe I went a little overboard though ☆553 Updated 3 weeks ago
- 🥤 RAGLite is a Python toolkit for Retrieval-Augmented Generation (RAG) with DuckDB or PostgreSQL ☆1,021 Updated last week
- ContextGem: Effortless LLM extraction from documents ☆1,191 Updated this week
- Serverless Modal + FastAPI + React + ColPali + Qdrant + GPT4o Vision RAG (V-RAG) Demo ☆382 Updated 7 months ago
- Parse PDFs into markdown using Vision LLMs ☆392 Updated 4 months ago
- OCR Benchmark ☆511 Updated 3 weeks ago
- 🦛 CHONK your texts with Chonkie ✨ - The no-nonsense RAG chunking library ☆1,538 Updated this week
- 📄 🧠 PageIndex: Document Index System for Reasoning-based RAG ☆1,066 Updated last week
- A Chrome extension for asking questions over websites ☆341 Updated 4 months ago
- ✨ AI interface for tinkerers (Ollama, Haystack RAG, Python) ☆461 Updated last week
- ExtractThinker is a Document Intelligence library for LLMs, offering ORM-style interaction for flexible and powerful document workflows. ☆1,284 Updated 2 weeks ago
- Colivara is a suite of services that allows you to store, search, and retrieve documents based on their visual embedding. ColiVara has st… ☆1,138 Updated last month
- Make any LLM think like OpenAI o1 and DeepSeek R1 ☆490 Updated 4 months ago
- A Kubernetes deployable instance of GroundX for document parsing, storage, and search. ☆758 Updated this week
- AI-first Search & Answer Engine for work. Open-source alternative to Glean. ☆544 Updated this week
- A list of useful Open Source tools and scrapers to gather data for LLMs ☆238 Updated 4 months ago
- The open-source multi-agent chat interface that lets you manage multiple agents in one dynamic conversation and add MCP servers for deep … ☆408 Updated 2 months ago
- Sample apps to help developers get started with Structured Outputs ☆643 Updated 5 months ago
- Reasoning Augmented Generation ☆855 Updated 4 months ago
- Deep Research for your internal data ☆327 Updated 2 weeks ago
- Deploy intelligence to your agents. Connect agents to graph-based intelligence automatically built from raw data. Build, ship, and manag… ☆448 Updated this week
- ☆1,567 Updated 3 months ago
- Real Time Speech Transcription with FastRTC ⚡️and Local Whisper 🤗 ☆661 Updated last week
- A simple Python program to implement the search-extract-summarize flow. ☆269 Updated last week
- TWIX is an open-source data extraction tool that reconstructs structured data from documents at scale, accurately and at low cost, by inf… ☆189 Updated 3 weeks ago
- Turn topics into essays in seconds! ☆184 Updated 2 months ago
- An open-source implementation of NotebookLM using Deepseek-V3 and PlayHT TTS. ☆275 Updated 5 months ago
- A minimal, open-source setup for serving Agents using FastAPI and Postgres. Built for speed, clarity, and dev happiness. ☆250 Updated last month
- A powerful Python tool for performing technical searches using the Perplexity API, optimized for retrieving precise facts, code examples,… ☆206 Updated 5 months ago