lilingxi01/nougat-replication

A full codebase for replicating the results of Nougat, from downloading the arXiv dataset to the final evaluation. It also contains a few fixes to the original codebase.

☆11

Alternatives and similar repositories for nougat-replication

Users that are interested in nougat-replication are comparing it to the libraries listed below.

SII-sc22mc / DocFusion
A Unified Framework for Document Parsing Tasks (Including Document Layout Analysis, OCR, Formula Recognition, and Table Recognition)
☆15 · Jul 1, 2025 · Updated last year
HKUST-KnowComp / SubeventWriter
Official code repository for the main conference paper in EMNLP 2022: SubeventWriter: Iterative Sub-event Sequence Generation with Cohere…
☆11 · Oct 16, 2022 · Updated 3 years ago
antonio-f / mixture-of-experts-from-scratch
Mixture of Experts from scratch
☆14 · Apr 12, 2024 · Updated 2 years ago
simenandre / pino-cloud-logging
Pino configuration for Google Cloud Platform. Enables structured logging!
☆19 · Updated this week
NormXU / nougat-latex-ocr
Codebase for fine-tuning / evaluating nougat-based image2latex generation models
☆160 · Sep 25, 2024 · Updated last year
bernhardschaefer / handwritten-diagram-datasets
☆20 · Sep 1, 2022 · Updated 3 years ago
tomups / nextjs-pino-http
Patched Next.js to have full logs via pino
☆15 · Mar 19, 2024 · Updated 2 years ago
samhita-alla / geolocator
Location Predictor 📍
☆16 · Jul 13, 2026 · Updated last week
microsoft / CompHRDoc
Datasets and Evaluation Scripts for CompHRDoc
☆59 · Feb 25, 2025 · Updated last year
mao-yuwei / paper_download
Download HTML papers to Word format
☆16 · Jun 30, 2026 · Updated 3 weeks ago
dhlab-epfl / dhSegment-text
Fork of dhSegment for experiments on visual and textual feature combination.
☆15 · Jan 30, 2021 · Updated 5 years ago
khadkechetan / information_extraction
☆11 · Nov 29, 2024 · Updated last year
liyunfei0411 / labelimg-master
☆14 · Apr 21, 2026 · Updated 3 months ago
HKUST-KnowComp / PseudoReasoner
Official code repository for Findings of EMNLP 2022 paper: PseudoReasoner: Leveraging Pseudo Labels for Commonsense Knowledge Base Popula…
☆11 · Oct 18, 2022 · Updated 3 years ago
HKUST-KnowComp / AbsPyramid
Official code repository for the paper: AbsPyramid: Benchmarking the Abstraction Ability of Language Models with a Unified Entailment Grap…
☆13 · Oct 30, 2024 · Updated last year
ResponsibleAILab / GenFlowchart
GenFlowChart is a framework that implements flowchart parsing using generative AI. Leveraging SAM for segmentation and OCR for text extra…
☆35 · Jun 5, 2024 · Updated 2 years ago
CogComp / APSI
Code for EMNLP 2020 paper: Analogous Process Structure Induction for Sub-event Sequence Prediction
☆11 · Oct 19, 2020 · Updated 5 years ago
HKUST-KnowComp / MICO
This is the code repo for Findings of EMNLP2022 paper: MICO: a multi-alternative contrastive learning framework for commonsense knowledg…
☆10 · Nov 29, 2022 · Updated 3 years ago
lilingxi01 / ams-date-picker
(WIP) Ams Date Picker - A modern, magical, and unstyled date picker for React. We have your favorite Time Machine and Input Supercharge o…
☆32 · Feb 17, 2023 · Updated 3 years ago
vl-illusion / GVIL
Code and data for EMNLP 2023 paper "Grounding Visual Illusions in Language: Do Vision-Language Models Perceive Illusions Like Humans?"
☆15 · Jan 25, 2024 · Updated 2 years ago
aorwall / moatless-testbeds
Moatless Testbeds allows you to create isolated testbed environments in a Kubernetes cluster where you can apply code changes through git…
☆14 · Apr 9, 2025 · Updated last year
Jiayi-Pan / ChatGPT_TUI
A Modern Text-based User Interface for ChatGPT.
☆13 · Jul 25, 2023 · Updated 2 years ago
Jiayi-Pan / UMJILuckyDraw
A simple JS script to register for a desired course when slots are available, for UM-SJTU JI students.
☆13 · May 9, 2022 · Updated 4 years ago
OBI-Future / OBI-Survey
[npj Heritage Science'26] The official GitHub page for the survey paper "Oracle Bone Inscriptions Processing: A Comprehensive Survey".
☆18 · Jun 30, 2026 · Updated 3 weeks ago
HKUST-KnowComp / CAT
Code for the ACL2023 paper: CAT: A Contextualized Conceptualization and Instantiation Framework for Commonsense Reasoning (https://aclant…
☆11 · May 9, 2023 · Updated 3 years ago
ekomanurung / spring-boot-web-kafka-producer-mongoDB
Simple Inventory CRUD Application using spring-boot - kafka - mongoDB
☆16 · Oct 24, 2024 · Updated last year
HKUST-KnowComp / ComplexHyperbolicKGE
Source code for the paper 'Complex Hyperbolic Knowledge Graph Embeddings with Fast Fourier Transform'.
☆12 · Nov 9, 2022 · Updated 3 years ago
lifan-yuan / PLMCalibration
Code for ACL 2023 paper "A Close Look into the Calibration of Pre-trained Language Models"
☆11 · May 9, 2023 · Updated 3 years ago
vered1986 / panic
PANiC - PAraphrasing Noun-Compounds
☆15 · Apr 6, 2018 · Updated 8 years ago
imagination-research / EEP
Efficient Expert Pruning for Sparse Mixture-of-Experts Language Models: Enhancing Performance and Reducing Inference Costs
☆25 · Nov 11, 2025 · Updated 8 months ago
tqfang / comet-deepspeed
Train large COMET (T5-3B/GPT2-XL) with small memory (on 11GB memory GPUs like 1080/2080) using DeepSpeed.
☆14 · Jan 23, 2022 · Updated 4 years ago
da03 / criticize_text_generation
A method for evaluating the high-level coherence of machine-generated texts. Identifies high-level coherence issues in transformer-based …
☆12 · Mar 18, 2023 · Updated 3 years ago
doc-analysis / DocBankLoader
DocBankLoader is a dataset loader for DocBank, and can convert DocBank to the Object Detection models' format.
☆24 · Mar 17, 2021 · Updated 5 years ago
Heidelberg-NLP / MHKA
The corresponding code from our paper "Social Commonsense Reasoning with Multi-Head Knowledge Attention (EMNLP 2020)". Do not hesitate to…
☆11 · Jun 12, 2022 · Updated 4 years ago
inuwamobarak / nougat
Nougat is Meta AI's OCR model designed to transcribe scientific PDFs into an easy-to-use Markdown format.
☆28 · Oct 14, 2023 · Updated 2 years ago
rajeev595 / RHS_HierNSE
Code for our work "Read, Highlight and Summarize: A Hierarchical Neural Semantic Encoder-based Approach"
☆10 · Oct 28, 2019 · Updated 6 years ago
iwiwi / historical-pruned-landmark-labeling
Historical shortest-path distance querying index by pruned landmark labeling
☆10 · May 24, 2014 · Updated 12 years ago
shizuo-kaji / PairedImageTranslation
Image translation for paired image datasets (AUTOMAP + Pix2pix)
☆21 · Aug 26, 2021 · Updated 4 years ago
lancopku / MUKI
[Findings of EMNLP22] From Mimicking to Integrating: Knowledge Integration for Pre-Trained Language Models
☆19 · Mar 16, 2023 · Updated 3 years ago