Simple package to extract text with coordinates from programmatic PDFs
☆305 · Jun 1, 2026 · Updated 2 weeks ago
Alternatives and similar repositories for docling-parse
Users that are interested in docling-parse are comparing it to the libraries listed below.
- ☆206 · Jun 4, 2026 · Updated last week
- Create fast graph language models from converted PDF documents for knowledge extraction and Q&A. ☆59 · Jan 27, 2025 · Updated last year
- Docling core data types and transformations ☆259 · Updated this week
- A set of tools to create synthetically-generated data from documents ☆48 · Aug 15, 2025 · Updated 10 months ago
- Running Docling as an API service ☆1,593 · Updated this week
- ☆32 · Updated this week
- Interact with the Deep Search platform for new knowledge explorations and discoveries ☆226 · Jan 24, 2025 · Updated last year
- ☆22 · Feb 1, 2025 · Updated last year
- Use Docling output in TypeScript and JavaScript ☆34 · May 20, 2025 · Updated last year
- Making docling agentic through MCP ☆653 · Updated this week
- Examples using the Deep Search functionalities ☆87 · Jan 29, 2025 · Updated last year
- Per-collection OCR leaderboards using VLM-as-judge ☆59 · Jun 2, 2026 · Updated last week
- Build document-native LLM applications ☆58 · Sep 11, 2024 · Updated last year
- This repository provides the code for applying Contrastive Learning Penalty Loss (CLPL) and Mixture of Experts (MoE) to the BGE-M3 text e… ☆11 · Dec 27, 2024 · Updated last year
- Poor man's simple harvester for arXiv resources ☆14 · Jul 14, 2023 · Updated 2 years ago
- Evaluation framework for document processing models and services. ☆75 · May 28, 2026 · Updated 2 weeks ago
- Get your documents ready for gen AI ☆61,291 · Updated this week
- Train, finetune and interact with a foundation model for the electric power grid. ☆81 · Updated this week
- LoRA supervised fine-tuning, RLHF (PPO) and RAG with llama-3-8B on the TLDR summarization dataset ☆14 · Feb 2, 2025 · Updated last year
- Repo for "Smart Word Suggestions" (SWS) task and benchmark ☆19 · Dec 4, 2023 · Updated 2 years ago
- This repository serves as a collection of scrapers procuring and structuring various legal datasets ☆19 · Jun 16, 2023 · Updated 2 years ago
- Dataset of PNG images from synthetically generated table layouts with annotations in JSONL files ☆154 · Sep 17, 2025 · Updated 8 months ago
- Repository hosting the common code for the entity-fishing clients ☆10 · May 18, 2026 · Updated 3 weeks ago
- ☆17 · Mar 22, 2024 · Updated 2 years ago
- Extract structured text from pdfs quickly ☆695 · Updated this week
- DocLayNet: A Large Human-Annotated Dataset for Document-Layout Analysis ☆436 · Feb 1, 2023 · Updated 3 years ago
- The IBM Developer Workshop Template (https://ibm.github.io/workshop-template/) ☆13 · Sep 17, 2025 · Updated 8 months ago
- Open source project for data preparation for GenAI applications ☆940 · Jun 4, 2026 · Updated last week
- A Knowledge Base for research software relying on large-scale text mining and curated knowledge sources ☆18 · May 14, 2023 · Updated 3 years ago
- [ICCV 23] MolGrapher: Graph-based Visual Recognition of Chemical Structures ☆96 · Nov 18, 2025 · Updated 6 months ago
- A Unified Toolkit for Deep Learning-Based Table Extraction ☆59 · Nov 21, 2024 · Updated last year
- 📚 Process PDFs, Word documents and more with spaCy ☆903 · Mar 27, 2026 · Updated 2 months ago
- A Faster LayoutReader Model based on LayoutLMv3, Sort OCR bboxes to reading order. ☆321 · Aug 15, 2025 · Updated 10 months ago
- Docling4j brings the functionalities of Docling in document understanding to Java® projects ☆28 · Mar 31, 2025 · Updated last year
- Open Source AI with Granite and Granite Code ☆27 · Oct 6, 2025 · Updated 8 months ago
- official code for "Fox: Focus Anywhere for Fine-grained Multi-page Document Understanding" ☆195 · May 31, 2024 · Updated 2 years ago
- Open Access PDF harvester ☆42 · May 3, 2024 · Updated 2 years ago
- ☆209 · Jun 9, 2026 · Updated last week
- Open Access PDF harvester, metadata aggregator and full-text ingester ☆62 · May 3, 2024 · Updated 2 years ago