A library for data streaming and augmentation
☆21 · May 5, 2025 · Updated 10 months ago
Alternatives and similar repositories for sotastream
Users that are interested in sotastream are comparing it to the libraries listed below
- ☆13 · Aug 23, 2024 · Updated last year
- Python package to augment multilingual data ☆15 · Feb 15, 2023 · Updated 3 years ago
- Post-editing Datasets by Rakuten (PEDRa) ☆14 · Jun 23, 2021 · Updated 4 years ago
- ☆20 · Dec 16, 2024 · Updated last year
- System Combination ☆16 · Aug 28, 2015 · Updated 10 years ago
- c++ mosestokenizer ☆18 · Mar 13, 2024 · Updated last year
- ☆19 · Jul 21, 2020 · Updated 5 years ago
- 👩💻 Code for the ACL paper "Detecting Edit Failures in LLMs: An Improved Specificity Benchmark" ☆20 · Jan 19, 2024 · Updated 2 years ago
- C++ code of "Learning to Parse and Translate Improves Neural Machine Translation" ☆21 · May 8, 2017 · Updated 8 years ago
- Library and command line utility to do approximate string matching of a source against a bitext index and get matched source and target. ☆51 · Apr 22, 2025 · Updated 10 months ago
- Official implementation for the paper "Transferring Visual Knowledge with Pre-trained Models for Multimodal Machine Translation", publish… ☆20 · Jun 3, 2024 · Updated last year
- Unsupervised multilingual sentence segmentation. ☆21 · Feb 26, 2021 · Updated 5 years ago
- Project OCELoT: an Open, Collaborative Evaluation Leaderboard of Translations ☆23 · Nov 5, 2025 · Updated 4 months ago
- OpusCleaner is a web interface that helps you select, clean and schedule your data for training machine translation models. ☆58 · Feb 3, 2026 · Updated last month
- MAMMOTH: MAssively Multilingual Modular Open Translation @ Helsinki ☆31 · Updated this week
- Code of "Improving Machine Translation with Human Feedback: An Exploration of Quality Estimation as a Reward Model" ☆23 · Jun 28, 2024 · Updated last year
- ☆59 · Nov 18, 2025 · Updated 3 months ago
- ☆21 · Feb 13, 2023 · Updated 3 years ago
- Library for fast text representation and classification. ☆31 · Jan 9, 2024 · Updated 2 years ago
- A tool that locates, downloads, and extracts machine translation corpora ☆162 · Sep 18, 2025 · Updated 5 months ago
- Tool for comparison and evaluation of machine translation. ☆56 · May 17, 2022 · Updated 3 years ago
- ☆26 · Jan 9, 2023 · Updated 3 years ago
- Tool to fix bitexts and tag near-duplicates for removal ☆34 · Sep 4, 2025 · Updated 6 months ago
- Tools for evaluating the performance of MT metrics on data from recent WMT metrics shared tasks. ☆126 · Oct 13, 2025 · Updated 4 months ago
- Tools for formatting WMT hypothesis and test sets in XML ☆27 · Apr 18, 2025 · Updated 10 months ago
- Adaptive Machine Translation with Large Language Models ☆32 · Jan 4, 2025 · Updated last year
- Reader Translator Generator - NMT toolkit based on pytorch ☆32 · Sep 12, 2023 · Updated 2 years ago
- ☆35 · Jun 15, 2023 · Updated 2 years ago
- Arduino library for accessing the AT24CXXX eeprom ☆16 · Jul 12, 2025 · Updated 7 months ago
- Arabic News Stance Corpus ☆11 · Feb 5, 2021 · Updated 5 years ago
- This repo contains the code to reproduce figures in my dissertation "Passive Imaging and Characterization of the Subsurface With Distribu… ☆10 · Jun 14, 2018 · Updated 7 years ago
- DiagnoSys is a comprehensive web application that provides advanced detection and analysis for various health conditions. This project le… ☆14 · May 6, 2024 · Updated last year
- ☆11 · Mar 11, 2024 · Updated last year
- Human evaluation results and translation output for the Translator Human Parity Data release ☆37 · Mar 19, 2018 · Updated 7 years ago
- Assessing syntactic abilities of BERT ☆40 · Jul 18, 2019 · Updated 6 years ago
- Corpus preprocessing ☆100 · Mar 16, 2024 · Updated last year
- A tool for holistic analysis of language generation systems ☆471 · Sep 22, 2025 · Updated 5 months ago
- ☆14 · May 14, 2019 · Updated 6 years ago
- Tensorflow implementation of the paper "Fast Compressive Sensing Using Generative Model with Structed Latent Variables" ☆10 · Apr 7, 2020 · Updated 5 years ago