Dedoc is a library (service) for automated document parsing and conversion to a uniform format. It automatically extracts content, logical structure, tables, and meta information from textual electronic documents. (Parse document; Document content extraction; Logical structure extraction; PDF parser; Scanned document parser; DOCX parser; HTML pars…
☆701 May 4, 2026 Updated 3 weeks ago
Alternatives and similar repositories for dedoc
Users that are interested in dedoc are comparing it to the libraries listed below.
- YSC 2023 Papers: A complete collection of research papers, code and data from the International Young Scientists Conference 2023 for youn… ☆12 Jan 17, 2024 Updated 2 years ago
- This is an unofficial ITMO beamer template made by me. Please, feel free to use it and contribute. ☆15 Oct 10, 2023 Updated 2 years ago
- Convert any PDF into its LaTeX source ☆18 May 15, 2025 Updated last year
- Handwritten Text Generation ☆17 Oct 17, 2022 Updated 3 years ago
- Framework for the automatic creation of CNN architectures ☆38 Nov 21, 2025 Updated 6 months ago
- Markdown Conversion ☆377 Jun 7, 2025 Updated 11 months ago
- A Docker-powered service for PDF document layout analysis. This service provides a powerful and flexible PDF analysis service. The servic… ☆1,142 May 6, 2026 Updated 2 weeks ago
- Python package to parse PDFs with different parsers ☆268 Sep 12, 2025 Updated 8 months ago
- Effective LLM Alignment Toolkit ☆153 Jun 25, 2025 Updated 11 months ago
- An on-premises, OCR-free unstructured data extraction, markdown conversion and benchmarking toolkit. (https://idp-leaderboard.org/) ☆2,018 Mar 17, 2026 Updated 2 months ago
- The tiniest sentence encoder for Russian language ☆245 Jul 25, 2024 Updated last year
- MERA (Multimodal Evaluation for Russian-language Architectures) is a new open benchmark for the Russian language for evaluating fundament… ☆63 Oct 7, 2024 Updated last year
- ContextGem: Effortless LLM extraction from documents ☆1,844 May 7, 2026 Updated 2 weeks ago
- Lomonosov Moscow State University (MSU) LaTeX Thesis Template ☆22 Jul 5, 2021 Updated 4 years ago
- Meeting summary from meeting transcript using LLM via OpenAI-like completion API ☆45 Mar 30, 2026 Updated last month
- Lightweight RTSP/MJPEG stream health monitor for heterogeneous camera networks. ☆38 Apr 5, 2026 Updated last month
- OCR, layout analysis, reading order, table recognition in 90+ languages ☆19,756 May 6, 2026 Updated 2 weeks ago
- Sample agents for Enterprise RAG Challenge 3: AI Agents in Action ☆47 Dec 9, 2025 Updated 5 months ago
- ⚡ A set of solutions for developing LLM applications in Russian with GigaChat support ⚡ ☆563 Updated this week
- ☆209 Apr 29, 2026 Updated 3 weeks ago
- The repository measures the quality of Yandexgpt, Gigachat, T-Pro, Saiga, Vikhr, Ruadapt on popular English-language benchmarks: MGSM, MATH, HumanE… ☆24 Apr 16, 2025 Updated last year
- A Faster LayoutReader Model based on LayoutLMv3, Sort OCR bboxes to reading order. ☆320 Aug 15, 2025 Updated 9 months ago
- Augmentex: a library for augmenting texts with errors ☆69 Jul 3, 2024 Updated last year
- Package for word stress detection ☆11 Jan 27, 2023 Updated 3 years ago
- Implementations and examples of algorithms from the FEDOT framework ☆12 May 27, 2021 Updated 4 years ago
- The repository contains the data of the NSS lab team for the hackathon "AgroCode Hack 2022". The task was to analyze cow treatment data a… ☆11 Sep 19, 2022 Updated 3 years ago
- ExtractThinker is a Document Intelligence library for LLMs, offering ORM-style interaction for flexible and powerful document workflows. ☆1,544 Aug 27, 2025 Updated 8 months ago
- ☆17 May 22, 2025 Updated last year
- Schema-Guided Reasoning (SGR) has agentic system design created by neuraldeep community ☆1,089 May 7, 2026 Updated 2 weeks ago
- Simple package to extract text with coordinates from programmatic PDFs ☆277 Updated this week
- coze api to openai ☆15 Sep 1, 2024 Updated last year
- 📥 cpdown - Copy to clipboard any webpage content/youtube subtitle as clean markdown with one click or shortcut ☆550 Updated this week
- Convert MUSE from TensorFlow to PyTorch and ONNX ☆11 May 22, 2024 Updated 2 years ago
- Code for fine-tuning LMs (rugpt, LLaMa, FRED T5) using transformers + deepspeed + LoRa ☆14 May 22, 2023 Updated 3 years ago
- ☆11 Dec 11, 2024 Updated last year
- The toolbox for the automated creation of the SWAN wind wave model configurations ☆13 Sep 20, 2023 Updated 2 years ago
- Multi-modal OCR pipeline optimized for ML training (text, figure, math, tables, diagrams) ☆683 May 13, 2026 Updated last week
- Generation of handwritten Cyrillic text using fonts ☆13 Mar 27, 2023 Updated 3 years ago
- Parser for Ars Electronica Archive: https://archive.aec.at/prix/ ☆14 Feb 18, 2023 Updated 3 years ago