Dedoc is a library (service) for automated document parsing and conversion to a uniform format. It automatically extracts content, logical structure, tables, and meta information from textual electronic documents. (Parse document; Document content extraction; Logical structure extraction; PDF parser; Scanned document parser; DOCX parser; HTML pars…
☆650Mar 19, 2026Updated this week
Alternatives and similar repositories for dedoc
Users that are interested in dedoc are comparing it to the libraries listed below.
- Synthetic Document Generator for Document AI. Creates document images annotated with text and bounding boxes of each word. Images contain…☆29Jul 23, 2025Updated 8 months ago
- YSC 2023 Papers: A complete collection of research papers, code and data from the International Young Scientists Conference 2023 for youn…☆12Jan 17, 2024Updated 2 years ago
- Convert any PDF into its LaTeX source☆18May 15, 2025Updated 10 months ago
- Source code for https://t.me/science_art_at_least_once_a_week channel☆16Jun 15, 2024Updated last year
- Handwritten Text Generation☆17Oct 17, 2022Updated 3 years ago
- Framework for the automatic creation of CNN architectures☆38Nov 21, 2025Updated 4 months ago
- Markdown Conversion☆374Jun 7, 2025Updated 9 months ago
- 🇷🇺 Punctuation restoration production-ready model for Russian language 🇷🇺☆59Jul 9, 2021Updated 4 years ago
- Python package to parse PDFs with different parsers☆249Sep 12, 2025Updated 6 months ago
- A Docker-powered service for PDF document layout analysis. This service provides a powerful and flexible PDF analysis service. The servic…☆1,099Mar 2, 2026Updated 3 weeks ago
- An on-premises, OCR-free unstructured data extraction, markdown conversion and benchmarking toolkit. (https://idp-leaderboard.org/)☆1,877Mar 17, 2026Updated last week
- Effective LLM Alignment Toolkit☆152Jun 25, 2025Updated 9 months ago
- Text reading pipeline that combines segmentation and OCR-models.☆26Feb 6, 2023Updated 3 years ago
- The tiniest sentence encoder for Russian language☆246Jul 25, 2024Updated last year
- ☆16Jan 19, 2023Updated 3 years ago
- MERA (Multimodal Evaluation for Russian-language Architectures) is a new open benchmark for the Russian language for evaluating fundament…☆63Oct 7, 2024Updated last year
- ContextGem: Effortless LLM extraction from documents☆1,815Mar 16, 2026Updated last week
- Code for paper "Short-term River Flood Forecasting using Composite Models and Automated Machine Learning: the Case Study of Lena River"☆12Dec 9, 2021Updated 4 years ago
- Your powerful publishing rich text editor☆49Aug 31, 2025Updated 6 months ago
- Meeting summary from meeting transcript using LLM via OpenAI-like completion API☆43Dec 2, 2025Updated 3 months ago
- OCR, layout analysis, reading order, table recognition in 90+ languages☆19,477Mar 1, 2026Updated 3 weeks ago
- ☆207Updated this week
- A Faster LayoutReader Model based on LayoutLMv3, Sort OCR bboxes to reading order.☆317Aug 15, 2025Updated 7 months ago
- ☆35Sep 21, 2023Updated 2 years ago
- ExtractThinker is a Document Intelligence library for LLMs, offering ORM-style interaction for flexible and powerful document workflows.☆1,501Aug 27, 2025Updated 6 months ago
- Implementations and examples of algorithms from the FEDOT framework☆12May 27, 2021Updated 4 years ago
- The repository contains the data of the NSS lab team for the hackathon "AgroCode Hack 2022". The task was to analyze cow treatment data a…☆11Sep 19, 2022Updated 3 years ago
- Schema-Guided Reasoning (SGR) agentic system design created by the neuraldeep community☆1,045Mar 18, 2026Updated last week
- ☆14May 22, 2025Updated 10 months ago
- coze api to openai☆15Sep 1, 2024Updated last year
- Code for fine-tuning LMs (rugpt, LLaMa, FRED T5) using transformers + deepspeed + LoRa☆14May 22, 2023Updated 2 years ago
- The toolbox for the automated creation of the SWAN wind wave model configurations☆13Sep 20, 2023Updated 2 years ago
- Multi-modal OCR pipeline optimized for ML training (text, figure, math, tables, diagrams)☆685May 20, 2025Updated 10 months ago
- Create realistic looking handwritten text PDFs from text files.☆15Jun 19, 2021Updated 4 years ago
- Generation of handwritten cyrillic text using fonts☆12Mar 27, 2023Updated 2 years ago
- A Comprehensive Toolkit for High-Quality PDF Content Extraction☆9,484Jan 3, 2025Updated last year
- Open-source framework for adaptive manufacturing processes scheduling☆34Feb 19, 2026Updated last month
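- 🚀 Completely rebuilt! A paper-reading tool with one-click screenshot AI translation, math formula support, pinned screenshots, window locking, and archive management☆145Mar 14, 2026Updated last week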
- SAGE: Spelling correction, corruption and evaluation for multiple languages☆164Dec 8, 2025Updated 3 months ago