Python package to augment multilingual data
☆15 · Feb 15, 2023 · Updated 3 years ago
Alternatives and similar repositories for smaug
Users that are interested in smaug are comparing it to the libraries listed below
- A repository for experiments in quality-aware decoding ☆18 · Jun 7, 2022 · Updated 3 years ago
- Introduction and scripts for ACL-2020 paper "On Exposure Bias, Hallucination and Domain Shift in Neural Machine Translation" ☆21 · Jun 23, 2020 · Updated 5 years ago
- Project OCELoT: an Open, Collaborative Evaluation Leaderboard of Translations ☆23 · Nov 5, 2025 · Updated 3 months ago
- ☆38 · Jun 3, 2021 · Updated 4 years ago
- Code of "Improving Machine Translation with Human Feedback: An Exploration of Quality Estimation as a Reward Model" ☆23 · Jun 28, 2024 · Updated last year
- Code for Findings of ACL 2023 paper "Improving Zero-shot Multilingual Neural Machine Translation by Leveraging Cross-lingual Consistency … ☆10 · Jul 18, 2023 · Updated 2 years ago
- Official code and data of "3AM: An Ambiguity-Aware Multi-Modal Machine Translation Dataset" ☆12 · Dec 8, 2024 · Updated last year
- ACL Paper Lists (machine translation) ☆13 · Mar 23, 2022 · Updated 3 years ago
- The implementation for our paper, "Improving Simultaneous Machine Translation with Monolingual Data," accepted to AAAI 2023. 🎉 ☆12 · Jul 19, 2023 · Updated 2 years ago
- This code helps to retrieve all papers from conferences and rank them by the number of (Google Scholar) citations. ☆12 · Dec 12, 2021 · Updated 4 years ago
- Post-editing Datasets by Rakuten (PEDRa) ☆14 · Jun 23, 2021 · Updated 4 years ago
- {DeepL, Google, WMT-Best, davinci-003, turbo, gpt-4} × {En-De, En-Cs, En-Ru, En-Zh, De-Fr, En-Ja, Uk-En, Uk-Cs, En-Hr, En-Ha, En-Is} ☆14 · Jun 18, 2023 · Updated 2 years ago
- Repository for "BLEU Meets COMET: Combining Lexical and Neural Metrics Towards Robust Machine Translation Evaluation", accepted at EAMT 2… ☆20 · Jul 19, 2023 · Updated 2 years ago
- A library for data streaming and augmentation ☆21 · May 5, 2025 · Updated 9 months ago
- Generate nice CLI from a function signature. ☆18 · Apr 25, 2023 · Updated 2 years ago
- c++ mosestokenizer ☆18 · Mar 13, 2024 · Updated last year
- ☆33 · Oct 1, 2021 · Updated 4 years ago
- ChatGPT plugin for Singapore HDB car park availability ☆19 · Jun 7, 2023 · Updated 2 years ago
- Implementation of our paper in EMNLP 2022, focused on the relationship between parent and child in transfer learning for low-resourc… ☆17 · Dec 7, 2022 · Updated 3 years ago
- ☆21 · May 30, 2022 · Updated 3 years ago
- Java library to tokenize Thai text into a list of TCCs ☆19 · May 30, 2017 · Updated 8 years ago
- GEMBA: GPT Estimation Metric Based Assessment ☆146 · Dec 15, 2025 · Updated 2 months ago
- A library for minimum Bayes risk (MBR) decoding ☆51 · Nov 2, 2025 · Updated 3 months ago
- Annotation meets Large Language Models (ChatGPT, GPT-3 and alike). ☆58 · Apr 4, 2023 · Updated 2 years ago
- Unsupervised multilingual sentence segmentation. ☆21 · Feb 26, 2021 · Updated 5 years ago
- ☆98 · Sep 25, 2025 · Updated 5 months ago
- ☆20 · Jan 16, 2024 · Updated 2 years ago
- Glot500: Scaling Multilingual Corpora and Language Models to 500 Languages (ACL 2023) ☆106 · Apr 20, 2024 · Updated last year
- ☆23 · Nov 15, 2022 · Updated 3 years ago
- Codenize your datasources. ☆27 · Dec 1, 2024 · Updated last year
- MT Evaluation in Many Languages via Zero-Shot Paraphrasing ☆102 · Jul 25, 2024 · Updated last year
- A tool that locates, downloads, and extracts machine translation corpora ☆162 · Sep 18, 2025 · Updated 5 months ago
- This is a fork of the awesome Joey-NMT with Reinforcement Learning algorithms like Policy Gradient, MRT and Advantage Actor Critic. ☆27 · Feb 10, 2023 · Updated 3 years ago
- A retrieval augmented sequence modeling toolkit implemented based on Fairseq ☆29 · Mar 3, 2023 · Updated 2 years ago
- Tool to fix bitexts and tag near-duplicates for removal ☆34 · Sep 4, 2025 · Updated 5 months ago
- A repo for sharing language resources related to the outbreak (in machine readable format) ☆25 · Sep 22, 2025 · Updated 5 months ago
- Scripts to preprocess training and test data and to run fast_align and giza ☆107 · Nov 2, 2021 · Updated 4 years ago
- Tools for formatting WMT hypothesis and test sets in XML ☆27 · Apr 18, 2025 · Updated 10 months ago
- Meta-Curriculum Learning for Domain Adaptation in Neural Machine Translation (AAAI 2021) ☆25 · Jun 18, 2022 · Updated 3 years ago