A full codebase for replicating the results of Nougat, from downloading the arXiv dataset through to the final evaluation. It also contains a few fixes to the original codebase.
☆11Dec 11, 2023Updated 2 years ago
Alternatives and similar repositories for nougat-replication
Users who are interested in nougat-replication are comparing it to the repositories listed below
- Official code repository for the main conference paper in EMNLP 2022: SubeventWriter: Iterative Sub-event Sequence Generation with Cohere…☆11Oct 16, 2022Updated 3 years ago
- Codebase for fine-tuning / evaluating nougat-based image2latex generation models☆159Sep 25, 2024Updated last year
- Mixture of Experts from scratch☆13Apr 12, 2024Updated last year
- Code and dataset for the EMNLP 2024 paper: GoldCoin: Grounding Large Language Models in Privacy Laws via Contextual Integrity Theory☆48Sep 26, 2024Updated last year
- Code for the ACL2023 paper: CAT: A Contextualized Conceptualization and Instantiation Framework for Commonsense Reasoning (https://aclant…☆11May 9, 2023Updated 2 years ago
- Historical shortest-path distance querying index by pruned landmark labeling☆10May 24, 2014Updated 11 years ago
- Official code repository for Findings of EMNLP 2022 paper: PseudoReasoner: Leveraging Pseudo Labels for Commonsense Knowledge Base Popula…☆11Oct 18, 2022Updated 3 years ago
- A method for evaluating the high-level coherence of machine-generated texts. Identifies high-level coherence issues in transformer-based …☆11Mar 18, 2023Updated 2 years ago
- This is the code repo for Findings of EMNLP2022 paper: MICO: a multi-alternative contrastive learning framework for commonsense knowledg…☆10Nov 29, 2022Updated 3 years ago
- Download all Moodle files with one click. This is a Chrome extension built to save time and effort from downloading files manually one by…☆13Feb 23, 2026Updated last week
- A simple JS script to register desired course when slots are available, for UM-SJTU JI students.☆12May 9, 2022Updated 3 years ago
- Datasets and Evaluation Scripts for CompHRDoc☆56Feb 25, 2025Updated last year
- Source code for the paper 'Complex Hyperbolic Knowledge Graph Embeddings with Fast Fourier Transform'.☆12Nov 9, 2022Updated 3 years ago
- Minimal implementation of multiple PEFT methods for LLaMA fine-tuning☆13May 7, 2023Updated 2 years ago
- ☆14Jul 6, 2022Updated 3 years ago
- Beyond Real: Imaginary Extension of Rotary Position Embeddings for Long-Context LLMs☆33Dec 9, 2025Updated 2 months ago
- Code for EMNLP 2020 paper: Analogous Process Structure Induction for Sub-event Sequence Prediction☆11Oct 19, 2020Updated 5 years ago
- Code for ACL 2023 paper "A Close Look into the Calibration of Pre-trained Language Models"☆11May 9, 2023Updated 2 years ago
- Official code repository for the paper: AbsPyramid: Benchmarking the Abstraction Ability of Language Models with a Unified Entailment Grap…☆13Oct 30, 2024Updated last year
- Data on verb transitivity in English and script to extract transitivity information from Google's syntactic ngrams corpus☆11Oct 1, 2018Updated 7 years ago
- ☆11Oct 12, 2023Updated 2 years ago
- ☆11Nov 29, 2024Updated last year
- ☆16Mar 27, 2023Updated 2 years ago
- This project aims to generate syntactic handwritten mathematical expressions. The dataset is generated from the CROHME 2014 training set.☆14Feb 24, 2022Updated 4 years ago
- The corresponding code from our paper "Social Commonsense Reasoning with Multi-Head Knowledge Attention (EMNLP 2020)". Do not hesitate to…☆11Jun 12, 2022Updated 3 years ago
- 🍨 Gelato — From Data Curation to Reinforcement Learning: Building a Strong Grounding Model for Computer-Use Agents☆38Dec 22, 2025Updated 2 months ago
- Code for our work "Read, Highlight and Summarize: A Hierarchical Neural Semantic Encoder-based Approach"☆10Oct 28, 2019Updated 6 years ago
- Introduction for new students☆14Jun 28, 2025Updated 8 months ago
- What Has Been Enhanced in my Knowledge-Enhanced Language Model?☆13Oct 26, 2022Updated 3 years ago
- Code for EMNLP 2022 Paper DANLI: Deliberative Agent for Following Natural Language Instructions☆18May 1, 2025Updated 10 months ago
- Code and data for EMNLP 2023 paper "Grounding Visual Illusions in Language: Do Vision-Language Models Perceive Illusions Like Humans?"☆15Jan 25, 2024Updated 2 years ago
- A Master Thesis Project on Video Keyword Extractor using Video Summarization techniques.☆11Oct 25, 2020Updated 5 years ago
- Pino configuration for Google Cloud Platform. Enables structured logging!☆19Feb 17, 2026Updated 2 weeks ago
- Pairwise Ranking Aggregation in a Crowdsourced Setting☆13Apr 13, 2014Updated 11 years ago
- Code Implementation for "NASH: A Simple Unified Framework of Structured Pruning for Accelerating Encoder-Decoder Language Models" (EMNLP …☆17Oct 17, 2023Updated 2 years ago
- PANiC - PAraphrasing Noun-Compounds☆15Apr 6, 2018Updated 7 years ago
- A Modern Text-based User Interface for ChatGPT.☆12Jul 25, 2023Updated 2 years ago
- Train large COMET (T5-3B/GPT2-XL) with small memory (on 11GB memory GPUs like 1080/2080) using DeepSpeed.☆14Jan 23, 2022Updated 4 years ago
- Flux training codes (lora) for UniTEX☆23Jun 8, 2025Updated 8 months ago