marian-nmt / sotastream
A library for data streaming and augmentation
☆21 · May 5, 2025 · Updated 9 months ago
Alternatives and similar repositories for sotastream
Users that are interested in sotastream are comparing it to the libraries listed below
- ☆13 · Aug 23, 2024 · Updated last year
- Python package to augment multilingual data ☆15 · Feb 15, 2023 · Updated 3 years ago
- Post-editing Datasets by Rakuten (PEDRa) ☆14 · Jun 23, 2021 · Updated 4 years ago
- ☆19 · Dec 16, 2024 · Updated last year
- System Combination ☆16 · Aug 28, 2015 · Updated 10 years ago
- c++ mosestokenizer ☆18 · Mar 13, 2024 · Updated last year
- 👩‍💻 Code for the ACL paper "Detecting Edit Failures in LLMs: An Improved Specificity Benchmark" ☆20 · Jan 19, 2024 · Updated 2 years ago
- C++ code of "Learning to Parse and Translate Improves Neural Machine Translation" ☆21 · May 8, 2017 · Updated 8 years ago
- ☆19 · Jul 21, 2020 · Updated 5 years ago
- Library and command line utility to do approximate string matching of a source against a bitext index and get matched source and target. ☆51 · Apr 22, 2025 · Updated 9 months ago
- Official implementation for the paper "Transferring Visual Knowledge with Pre-trained Models for Multimodal Machine Translation", publish… ☆20 · Jun 3, 2024 · Updated last year
- Unsupervised multilingual sentence segmentation. ☆21 · Feb 26, 2021 · Updated 4 years ago
- Project OCELoT: an Open, Collaborative Evaluation Leaderboard of Translations ☆23 · Nov 5, 2025 · Updated 3 months ago
- Code of "Improving Machine Translation with Human Feedback: An Exploration of Quality Estimation as a Reward Model" ☆23 · Jun 28, 2024 · Updated last year
- Library for fast text representation and classification. ☆31 · Jan 9, 2024 · Updated 2 years ago
- ☆21 · Feb 13, 2023 · Updated 3 years ago
- A tool that locates, downloads, and extracts machine translation corpora ☆162 · Sep 18, 2025 · Updated 4 months ago
- Tool to fix bitexts and tag near-duplicates for removal ☆34 · Sep 4, 2025 · Updated 5 months ago
- Tools for evaluating the performance of MT metrics on data from recent WMT metrics shared tasks. ☆126 · Oct 13, 2025 · Updated 4 months ago
- Tools for formatting WMT hypothesis and test sets in XML ☆27 · Apr 18, 2025 · Updated 9 months ago
- Adaptive Machine Translation with Large Language Models ☆32 · Jan 4, 2025 · Updated last year
- ☆35 · Jun 15, 2023 · Updated 2 years ago
- Human evaluation results and translation output for the Translator Human Parity Data release ☆37 · Mar 19, 2018 · Updated 7 years ago
- DiagnoSys is a comprehensive web application that provides advanced detection and analysis for various health conditions. This project le… ☆14 · May 6, 2024 · Updated last year
- ☆11 · Mar 11, 2024 · Updated last year
- Arabic News Stance Corpus ☆11 · Feb 5, 2021 · Updated 5 years ago
- A High-Quality Multilingual Dataset for Structured Documentation Translation ☆37 · May 1, 2025 · Updated 9 months ago
- LaNMT: Latent-variable Non-autoregressive Neural Machine Translation with Deterministic Inference ☆79 · Aug 27, 2021 · Updated 4 years ago
- Corpus preprocessing ☆99 · Mar 16, 2024 · Updated last year
- A tool for holistic analysis of language generation systems ☆471 · Sep 22, 2025 · Updated 4 months ago
- The sparse Bayesian learning sandbox ☆11 · Jul 4, 2021 · Updated 4 years ago
- ☆10 · Nov 16, 2023 · Updated 2 years ago
- ☆14 · May 14, 2019 · Updated 6 years ago
- [ACM MM 2024 (Oral)] Official PyTorch Implementation of Paper "MovingColor: Seamless Fusion of Fine-grained Video Color Enhancement" ☆11 · Dec 30, 2024 · Updated last year
- [Advanced Photonics Research, 2021] Control tightly focused fields via manipulating pupil functions ☆10 · Dec 25, 2024 · Updated last year
- This repository shows how to implement a basic model for multimodal entailment. ☆10 · Aug 17, 2021 · Updated 4 years ago
- TensorFlow implementation of the paper "Fast Compressive Sensing Using Generative Model with Structured Latent Variables" ☆10 · Apr 7, 2020 · Updated 5 years ago
- ☆10 · Dec 12, 2022 · Updated 3 years ago
- SHAS: Approaching optimal Segmentation for End-to-End Speech Translation ☆41 · Feb 9, 2023 · Updated 3 years ago