facebookresearch / nougat
Implementation of Nougat Neural Optical Understanding for Academic Documents
☆9,761 Updated 10 months ago
Alternatives and similar repositories for nougat
Users who are interested in nougat are comparing it to the libraries listed below.
- Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022 ☆6,725 Updated last year
- [ICLR 2024] Efficient Streaming Language Models with Attention Sinks ☆7,165 Updated last year
- OCR, layout analysis, reading order, table recognition in 90+ languages ☆19,028 Updated 2 months ago
- Table Transformer (TATR) is a deep learning model for extracting tables from unstructured documents (PDFs and images). This is also the o… ☆2,819 Updated last year
- A Repo For Document AI ☆3,109 Updated last week
- An Open-Source Python3 tool with SMALL models for recognizing layouts, tables, math formulas (LaTeX), and text in images, converting them… ☆2,735 Updated 5 months ago
- Enhance Tesseract OCR output for scanned PDFs by applying Large Language Model (LLM) corrections. ☆2,798 Updated 10 months ago
- High accuracy RAG for answering questions from scientific documents with citations ☆7,932 Updated this week
- LLMs built upon Evol Instruct: WizardLM, WizardCoder, WizardMath ☆9,472 Updated 6 months ago
- Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model ☆8,041 Updated 10 months ago
- Convert PDF to markdown + JSON quickly with high accuracy ☆30,547 Updated this week
- Convert documents to structured data effortlessly. Unstructured is an open-source ETL solution for transforming complex documents into clean… ☆13,472 Updated last week
- Simple and efficient PyTorch-native transformer text generation in <1000 LOC of Python. ☆6,169 Updated 4 months ago
- Improved file parsing for LLMs ☆3,146 Updated last year
- Math OCR model that outputs LaTeX and markdown ☆1,103 Updated 10 months ago
- Structured Outputs ☆13,161 Updated 2 weeks ago
- Implementation of the LLaMA language model based on nanoGPT. Supports flash attention, Int8 and GPTQ 4bit quantization, LoRA and LLaMA-Ad… ☆6,093 Updated 5 months ago
- ColBERT: state-of-the-art neural search (SIGIR'20, TACL'21, NeurIPS'21, NAACL'22, CIKM'22, ACL'23, EMNLP'23) ☆3,746 Updated 2 months ago
- Supercharge Your LLM Application Evaluations 🚀 ☆11,824 Updated this week
- mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding ☆2,265 Updated 6 months ago
- LlamaIndex is the leading framework for building LLM-powered agents over your data. ☆46,055 Updated this week
- A machine learning software for extracting information from scholarly documents ☆4,519 Updated this week
- A series of large language models trained from scratch by developers @01-ai ☆7,849 Updated last year
- An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena. ☆39,332 Updated 6 months ago
- Large Language Model Text Generation Inference ☆10,711 Updated last week
- Fast and memory-efficient exact attention ☆21,317 Updated this week
- A collection of original, innovative ideas and algorithms towards Advanced Literate Machinery. This project is maintained by the OCR Team… ☆1,805 Updated 8 months ago
- Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller, within 1% word error rate. ☆4,008 Updated 11 months ago
- Run Mixtral-8x7B models in Colab or consumer desktops ☆2,327 Updated last year
- Universal LLM Deployment Engine with ML Compilation ☆21,777 Updated this week