clovaai / donut

Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022

☆6,444

Alternatives and similar repositories for donut

Users who are interested in donut are comparing it to the libraries listed below.

deepdoctection / deepdoctection
A Repo For Document AI
☆2,899 Updated last week
mindee / doctr
docTR (Document Text Recognition) - a seamless, high-performing & accessible library for OCR-related tasks powered by Deep Learning.
☆5,011 Updated last week
microsoft / table-transformer
Table Transformer (TATR) is a deep learning model for extracting tables from unstructured documents (PDFs and images). This is also the o…
☆2,682 Updated last year
impira / docquery
An easy way to extract information from documents
☆1,772 Updated 2 years ago
facebookresearch / nougat
Implementation of Nougat Neural Optical Understanding for Academic Documents
☆9,546 Updated 5 months ago
AlibabaResearch / AdvancedLiterateMachinery
A collection of original, innovative ideas and algorithms towards Advanced Literate Machinery. This project is maintained by the OCR Team…
☆1,749 Updated 3 months ago
tstanislawek / awesome-document-understanding
A curated list of resources for Document Understanding (DU) topic
☆1,442 Updated 2 years ago
open-mmlab / mmocr
OpenMMLab Text Detection, Recognition and Understanding Toolbox
☆4,580 Updated 8 months ago
Layout-Parser / layout-parser
A Unified Toolkit for Deep Learning Based Document Image Analysis
☆5,384 Updated 11 months ago
1rgs / jsonformer
A Bulletproof Way to Generate Structured JSON from Language Models
☆4,776 Updated last year
Lightning-AI / lit-llama
Implementation of the LLaMA language model based on nanoGPT. Supports flash attention, Int8 and GPTQ 4bit quantization, LoRA and LLaMA-Ad…
☆6,079 Updated 3 weeks ago
huggingface / text-generation-inference
Large Language Model Text Generation Inference
☆10,367 Updated last week
eyurtsev / kor
LLM(😽)
☆1,681 Updated 5 months ago
Filimoa / open-parse
Improved file parsing for LLM’s
☆3,023 Updated 8 months ago
Dicklesworthstone / llm_aided_ocr
Enhance Tesseract OCR output for scanned PDFs by applying Large Language Model (LLM) corrections.
☆2,714 Updated 5 months ago
microsoft / unilm
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
☆21,572 Updated 3 weeks ago
refuel-ai / autolabel
Label, clean and enrich text datasets with LLMs.
☆2,249 Updated 4 months ago
ray-project / llm-numbers
Numbers every LLM developer should know
☆4,248 Updated last year
ibm-aur-nlp / PubLayNet
☆999 Updated 3 weeks ago
xlang-ai / instructor-embedding
[ACL 2023] One Embedder, Any Task: Instruction-Finetuned Text Embeddings
☆1,990 Updated 6 months ago
artidoro / qlora
QLoRA: Efficient Finetuning of Quantized LLMs
☆10,583 Updated last year
dottxt-ai / outlines
Structured Outputs
☆12,149 Updated last week
openlm-research / open_llama
OpenLLaMA, a permissively licensed open source reproduction of Meta AI’s LLaMA 7B trained on the RedPajama dataset
☆7,514 Updated 2 years ago
imaurer / awesome-llm-json
Resource list for generating JSON using LLMs via function calling, tools, CFG. Libraries, Models, Notebooks, etc.
☆2,127 Updated 5 months ago
Unstructured-IO / unstructured
Convert documents to structured data effortlessly. Unstructured is open-source ETL solution for transforming complex documents into clean…
☆12,035 Updated this week
google-research / pix2struct
☆653 Updated last month
zacharywhitley / awesome-ocr
☆968 Updated 10 months ago
philschmid / document-ai-transformers
☆372 Updated last year
opendatalab / DocLayout-YOLO
DocLayout-YOLO: Enhancing Document Layout Analysis through Diverse Synthetic Data and Global-to-Local Adaptive Perception
☆1,473 Updated 3 months ago
promptslab / Promptify
Prompt Engineering | Prompt Versioning | Use GPT or other prompt based models to get structured output. Join our discord for Prompt-Engin…
☆4,003 Updated 5 months ago