prajdabre / yanmtt
Yet Another Neural Machine Translation Toolkit
☆173 Updated 2 months ago
Related projects:
- A tool that locates, downloads, and extracts machine translation corpora ☆145 Updated 3 months ago
- A neural word aligner based on multilingual BERT ☆321 Updated 2 years ago
- MoverScore: Text Generation Evaluating with Contextualized Embeddings and Earth Mover Distance ☆192 Updated 10 months ago
- Repository to collect and categorize Grammatical Error Correction papers. ☆112 Updated 4 months ago
- Tools for evaluating the performance of MT metrics on data from recent WMT metrics shared tasks. ☆85 Updated last month
- [EMNLP 2021] LM-Critic: Language Models for Unsupervised Grammatical Error Correction ☆118 Updated 2 years ago
- Easier Automatic Sentence Simplification Evaluation ☆157 Updated 11 months ago
- A tool for calculating character n-gram F score ☆65 Updated last year
- DialoGLUE: A Natural Language Understanding Benchmark for Task-Oriented Dialogue ☆280 Updated last year
- A repository with the code related to experiments around context-aware machine translation ☆48 Updated 2 years ago
- Transformer based translation quality estimation ☆106 Updated last year
- This dataset contains synthetic training data for grammatical error correction. The corpus is generated by corrupting clean sentences fro… ☆152 Updated 2 years ago
- Obtain Word Alignments using Pretrained Language Models (e.g., mBERT) ☆347 Updated 10 months ago
- Bicleaner is a parallel corpus classifier/cleaner that aims at detecting noisy sentence pairs in a parallel corpus. ☆148 Updated 3 months ago
- This repo supports various cross-lingual transfer learning & multilingual NLP models. ☆91 Updated last year
- DialogSum: A Real-life Scenario Dialogue Summarization Dataset - Findings of ACL 2021 ☆170 Updated last year
- MT Evaluation in Many Languages via Zero-Shot Paraphrasing ☆101 Updated last month
- cLang-8 is a dataset for grammatical error correction. ☆102 Updated 2 years ago
- Python library & examples for Masked Language Model Scoring (ACL 2020) ☆333 Updated last year
- This repository contains the code for "Generating Datasets with Pretrained Language Models". ☆188 Updated 3 years ago
- TyDi QA contains 200k human-annotated question-answer pairs in 11 Typologically Diverse languages, written without seeing the answer and … ☆289 Updated 4 years ago
- SacreROUGE is a library dedicated to the use and development of text generation evaluation metrics with an emphasis on summarization. ☆135 Updated last year
- ☆175 Updated 2 years ago
- ☆226 Updated 3 years ago
- Glot500: Scaling Multilingual Corpora and Language Models to 500 Languages -- ACL 2023 ☆96 Updated 5 months ago
- A simple library for querying the URIEL typological database. ☆87 Updated 5 months ago
- How to finetune mbart using fairseq ☆20 Updated 3 years ago
- Reduce the size of pretrained Hugging Face models via vocabulary trimming. ☆39 Updated last year
- Easily fine tune GPT-2 to fill in missing text ☆197 Updated last year
- The official tool for creating proceedings for conferences of the Association for Computational Linguistics (ACL). ☆217 Updated last month