openlanguagedata / flores
The FLORES+ Machine Translation Benchmark
☆110 · Nov 12, 2024 · Updated last year
Alternatives and similar repositories for flores
Users who are interested in flores compare it to the repositories listed below.
- Seed Machine Translation Data ☆33 · Nov 12, 2024 · Updated last year
- NTREX -- News Test References for MT Evaluation ☆88 · Jun 5, 2024 · Updated last year
- Facebook Low Resource (FLoRes) MT Benchmark ☆762 · Nov 20, 2023 · Updated 2 years ago
- A parallel evaluation data set of SAP software documentation with document structure annotation ☆14 · Jul 30, 2025 · Updated 6 months ago
- ☆254 · May 30, 2024 · Updated last year
- A native Chinese, level-graded benchmark for coding ability ☆15 · Apr 11, 2024 · Updated last year
- Tools for evaluating the performance of MT metrics on data from recent WMT metrics shared tasks. ☆126 · Oct 13, 2025 · Updated 4 months ago
- OpusFilter - Parallel corpus processing toolkit ☆115 · Updated this week
- A High-Quality Multilingual Dataset for Structured Documentation Translation ☆37 · May 1, 2025 · Updated 9 months ago
- A tool that locates, downloads, and extracts machine translation corpora ☆162 · Sep 18, 2025 · Updated 4 months ago
- (NAACL 2024) Guiding Large Language Models to Post-Edit Machine Translation with Error Annotations ☆15 · Apr 14, 2025 · Updated 10 months ago
- ☆21 · May 30, 2022 · Updated 3 years ago
- ☆133 · Jan 22, 2026 · Updated 3 weeks ago
- A framework for evaluating Machine Translation models. ☆12 · May 26, 2025 · Updated 8 months ago
- ☆98 · Sep 25, 2025 · Updated 4 months ago
- Feature Decay Algorithms ☆11 · Mar 5, 2014 · Updated 11 years ago
- Codes for "Benchmarking the Generation of Fact Checking Explanations" ☆10 · Aug 16, 2024 · Updated last year
- On the Complementarity between Pre-Training and Back-Translation for Neural Machine Translation (Findings of EMNLP 2021) ☆13 · Nov 21, 2021 · Updated 4 years ago
- ☆10 · Mar 22, 2024 · Updated last year
- ParCourE - Parallel Corpus Explorer ☆12 · Dec 27, 2021 · Updated 4 years ago
- 🕸 GlotWeb: Web Indexing for Low-Resource Languages -- under construction. ☆17 · Aug 13, 2025 · Updated 6 months ago
- Official code and data of "3AM: An Ambiguity-Aware Multi-Modal Machine Translation Dataset" ☆12 · Dec 8, 2024 · Updated last year
- Code, datasets, models for the paper "Automatic Evaluation of Attribution by Large Language Models" ☆56 · Jul 3, 2023 · Updated 2 years ago
- Security Council resolutions in XML AKN4UN format ☆17 · Updated this week
- Coursera Corpus Mining and Multistage Fine-Tuning for Improving Lectures Translation ☆15 · Aug 27, 2024 · Updated last year
- ☆14 · Oct 4, 2024 · Updated last year
- A library for preparing data for machine translation research (monolingual preprocessing, bitext mining, etc.) built by the FAIR NLLB te… ☆295 · Updated this week
- GEMBA — GPT Estimation Metric Based Assessment ☆145 · Dec 15, 2025 · Updated last month
- ☆263 · Aug 1, 2025 · Updated 6 months ago
- Bicleaner is a parallel corpus classifier/cleaner that aims at detecting noisy sentence pairs in a parallel corpus. ☆160 · Jun 18, 2024 · Updated last year
- simple translate ☆12 · Mar 7, 2020 · Updated 5 years ago
- Post-editing Datasets by Rakuten (PEDRa) ☆14 · Jun 23, 2021 · Updated 4 years ago
- ☆19 · Sep 16, 2025 · Updated 4 months ago
- ☆14 · Jan 4, 2021 · Updated 5 years ago
- Library for pruning experts per language pair in NLLB-200 ☆34 · Jul 7, 2023 · Updated 2 years ago
- Bilingual term extractor ☆59 · Nov 19, 2025 · Updated 2 months ago
- State-of-the-art LLM-based translation models. ☆577 · Apr 9, 2025 · Updated 10 months ago
- [EMNLP'23] Official Code for "FOCUS: Effective Embedding Initialization for Monolingual Specialization of Multilingual Models" ☆36 · Jun 7, 2025 · Updated 8 months ago
- A Neural Framework for MT Evaluation ☆713 · Feb 5, 2026 · Updated last week