openlanguagedata/flores

The FLORES+ Machine Translation Benchmark

☆112

Alternatives and similar repositories for flores

Users who are interested in flores are comparing it to the repositories listed below.

openlanguagedata / seed
Seed Machine Translation Data
☆34 · Nov 12, 2024 · Updated last year
MicrosoftTranslator / NTREX
NTREX -- News Test References for MT Evaluation
☆87 · Jun 5, 2024 · Updated 2 years ago
facebookresearch / flores
Facebook Low Resource (FLoRes) MT Benchmark
☆771 · Nov 20, 2023 · Updated 2 years ago
hsing-wang / Awesome-LLM-MT
☆254 · May 30, 2024 · Updated 2 years ago
SAP / software-documentation-data-set-for-machine-translation
A parallel evaluation data set of SAP software documentation with document structure annotation
☆15 · Jun 12, 2026 · Updated last month
Helsinki-NLP / OpusFilter
OpusFilter - Parallel corpus processing toolkit
☆115 · Jul 1, 2026 · Updated 3 weeks ago
thammegowda / mtdata
A tool that locates, downloads, and extracts machine translation corpora
☆167 · Apr 13, 2026 · Updated 3 months ago
nlpcuom / English-Tamil-Parallel-Corpus
☆14 · Jan 4, 2021 · Updated 5 years ago
google-research / metricx
☆147 · Jul 2, 2026 · Updated 3 weeks ago
google-research / mt-metrics-eval
Tools for evaluating the performance of MT metrics on data from recent WMT metrics shared tasks.
☆132 · Apr 23, 2026 · Updated 3 months ago
salesforce / localization-xml-mt
A High-Quality Multilingual Dataset for Structured Documentation Translation
☆39 · May 1, 2025 · Updated last year
cisnlp / GlotWeb
[WWW 2026] 🕸 GlotWeb: Web Indexing for Minority Languages
☆17 · Apr 14, 2026 · Updated 3 months ago
alpoktem / bible2speechDB
Scripts to create speech corpora from open.bible
☆13 · Jan 3, 2022 · Updated 4 years ago
google / wmt-mqm-human-evaluation
☆100 · Sep 25, 2025 · Updated 10 months ago
dayeonki / mt_feedback
Code for "Guiding Large Language Models to Post-Edit Machine Translation with Error Annotations" [NAACL Findings 2024]
☆14 · Apr 3, 2026 · Updated 3 months ago
shyyhs / CourseraParallelCorpusMining
Coursera Corpus Mining and Multistage Fine-Tuning for Improving Lectures Translation
☆15 · Aug 27, 2024 · Updated last year
mahfuzibnalam / terminology_evaluation
☆21 · May 30, 2022 · Updated 4 years ago
NJUNLP / MMT-LLM
☆36 · Jun 15, 2023 · Updated 3 years ago
wmt-conference / wmt23-news-systems
☆14 · Oct 6, 2025 · Updated 9 months ago
bicici / FDA
Feature Decay Algorithms
☆11 · Mar 5, 2014 · Updated 12 years ago
CONE-MT / Lego-MT
☆10 · Mar 22, 2024 · Updated 2 years ago
StephAO / olfmlm
☆18 · Nov 25, 2022 · Updated 3 years ago
jiaohuix / nmt_data_tools
machine translation data process tools
☆10 · Apr 29, 2024 · Updated 2 years ago
OSU-NLP-Group / AttrScore
Code, datasets, models for the paper "Automatic Evaluation of Attribution by Large Language Models"
☆56 · Jul 3, 2023 · Updated 3 years ago
langtech-bsc / mt-evaluation
A framework for evaluating Machine Translation models.
☆13 · Apr 21, 2026 · Updated 3 months ago
MicrosoftTranslator / GEMBA
GEMBA — GPT Estimation Metric Based Assessment
☆153 · Dec 15, 2025 · Updated 7 months ago
rgwt123 / simple-fairseq
simple translate
☆12 · Mar 7, 2020 · Updated 6 years ago
bitextor / bicleaner
Bicleaner is a parallel corpus classifier/cleaner that aims at detecting noisy sentence pairs in a parallel corpus.
☆160 · Jun 18, 2024 · Updated 2 years ago
Prompsit / mutnmt
An educational tool to train, inspect, evaluate and translate using neural engines
☆20 · Mar 13, 2025 · Updated last year
fe1ixxu / ALMA
State-of-the-art LLM-based translation models.
☆590 · Apr 9, 2025 · Updated last year
zouharvi / pearmut
Platform for Evaluating and Reviewing of Multilingual Tasks
☆32 · Updated this week
cisnlp / Glot500
[ACL 2023] Glot500: Scaling Multilingual Corpora and Language Models to 500 Languages
☆107 · Apr 14, 2026 · Updated 3 months ago
rewicks / ersatz
☆51 · Jul 25, 2024 · Updated 2 years ago
EleanorJiang / BlonDe
Official implementations for (1) BlonDe: An Automatic Evaluation Metric for Document-level Machine Translation and (2) Discourse Centric …
☆85 · Sep 21, 2023 · Updated 2 years ago
facebookresearch / stopes
A library for preparing data for machine translation research (monolingual preprocessing, bitext mining, etc.) built by the FAIR NLLB te…
☆309 · Updated this week
huggingface / finephrase
Synthetic pretraining data by rephrasing the web
☆25 · Jun 5, 2026 · Updated last month
MaxyLee / 3AM
Official code and data of "3AM: An Ambiguity-Aware Multi-Modal Machine Translation Dataset"
☆12 · Dec 8, 2024 · Updated last year
yrf1 / LLM-MassiveMulticultureNormsKnowledge-NCLB
☆20 · Mar 12, 2025 · Updated last year
google-research / url-nlp
☆273 · Aug 1, 2025 · Updated 11 months ago