Python package to augment multilingual data
☆15 · Feb 15, 2023 · Updated 3 years ago
Alternatives and similar repositories for smaug
Users who are interested in smaug are comparing it to the libraries listed below.
- Repository for "BLEU Meets COMET: Combining Lexical and Neural Metrics Towards Robust Machine Translation Evaluation", accepted at EAMT 2… ☆20 · Jul 19, 2023 · Updated 2 years ago
- A repository for experiments in quality-aware decoding ☆18 · Jun 7, 2022 · Updated 3 years ago
- Introduction and scripts for ACL-2020 paper "On Exposure Bias, Hallucination and Domain Shift in Neural Machine Translation" ☆21 · Jun 23, 2020 · Updated 5 years ago
- A library for minimum Bayes risk (MBR) decoding ☆52 · Nov 2, 2025 · Updated 4 months ago
- Code for Findings of ACL 2023 paper "Improving Zero-shot Multilingual Neural Machine Translation by Leveraging Cross-lingual Consistency … ☆10 · Jul 18, 2023 · Updated 2 years ago
- ☆38 · Jun 3, 2021 · Updated 4 years ago
- Official code and data of "3AM: An Ambiguity-Aware Multi-Modal Machine Translation Dataset" ☆12 · Dec 8, 2024 · Updated last year
- Code of "Improving Machine Translation with Human Feedback: An Exploration of Quality Estimation as a Reward Model" ☆23 · Jun 28, 2024 · Updated last year
- ☆18 · Mar 20, 2019 · Updated 7 years ago
- Project OCELoT: an Open, Collaborative Evaluation Leaderboard of Translations ☆23 · Nov 5, 2025 · Updated 4 months ago
- ☆33 · Oct 1, 2021 · Updated 4 years ago
- Repository for DEMETR: Diagnosing Evaluation Metrics for Translation ☆17 · Nov 29, 2022 · Updated 3 years ago
- ☆13 · Mar 25, 2022 · Updated 3 years ago
- SODA: Story Oriented Dense Video Captioning Evaluation Framework ☆14 · May 3, 2024 · Updated last year
- ☆20 · Jan 16, 2024 · Updated 2 years ago
- A tool that locates, downloads, and extracts machine translation corpora ☆163 · Sep 18, 2025 · Updated 6 months ago
- {DeepL, Google, WMT-Best, davinci-003, turbo, gpt-4} × {En-De, En-Cs, En-Ru, En-Zh, De-Fr, En-Ja, Uk-En, Uk-Cs, En-Hr, En-Ha, En-Is} ☆14 · Jun 18, 2023 · Updated 2 years ago
- ACL Paper Lists (machine translation) ☆13 · Mar 23, 2022 · Updated 3 years ago
- This code helps to retrieve all papers from conferences and rank them by the number of (Google Scholar) citations. ☆12 · Dec 12, 2021 · Updated 4 years ago
- BLEURT implementation in PyTorch ☆37 · Jan 19, 2023 · Updated 3 years ago
- Code and data for the paper "Disentangling Uncertainty in Machine Translation Evaluation", accepted at EMNLP 2022. ☆23 · Jun 23, 2023 · Updated 2 years ago
- ☆98 · Sep 25, 2025 · Updated 5 months ago
- GEMBA: GPT Estimation Metric Based Assessment ☆146 · Dec 15, 2025 · Updated 3 months ago
- NLP course @ CS Faculty, HSE ☆15 · Mar 4, 2020 · Updated 6 years ago
- ☆21 · May 30, 2022 · Updated 3 years ago
- A library for data streaming and augmentation ☆21 · May 5, 2025 · Updated 10 months ago
- Scripts to preprocess training and test data and to run fast_align and giza ☆107 · Nov 2, 2021 · Updated 4 years ago
- Code and data accompanying our ACL 2020 paper, "Unsupervised Domain Clusters in Pretrained Language Models". ☆58 · Aug 22, 2020 · Updated 5 years ago
- Glot500: Scaling Multilingual Corpora and Language Models to 500 Languages -- ACL 2023 ☆106 · Apr 20, 2024 · Updated last year
- This repository is the implementation of "Top-down RST Parsing Utilizing Granularity Levels in Documents" published at AAAI 2020. ☆20 · Dec 14, 2020 · Updated 5 years ago
- ☆23 · May 22, 2024 · Updated last year
- C++ mosestokenizer ☆18 · Mar 13, 2024 · Updated 2 years ago
- Post-editing Datasets by Rakuten (PEDRa) ☆14 · Jun 23, 2021 · Updated 4 years ago
- The implementation for our paper, "Improving Simultaneous Machine Translation with Monolingual Data," accepted to AAAI 2023. 🎉 ☆12 · Jul 19, 2023 · Updated 2 years ago
- Unofficial faiss wheel builder for NVIDIA GPU ☆34 · Mar 8, 2026 · Updated last week
- ☆23 · Nov 1, 2022 · Updated 3 years ago
- ☆16 · Apr 10, 2024 · Updated last year
- Annotation meets Large Language Models (ChatGPT, GPT-3 and alike). ☆58 · Apr 4, 2023 · Updated 2 years ago
- This is a fork of the awesome Joey-NMT with Reinforcement Learning algorithms like Policy Gradient, MRT and Advantage Actor Critic. ☆27 · Feb 10, 2023 · Updated 3 years ago