marian-nmt/sotastream

A library for data streaming and augmentation

☆22

Alternatives and similar repositories for sotastream

Users that are interested in sotastream are comparing it to the libraries listed below.

fyvo / WMT-Biomed-Test
☆13 · Aug 23, 2024 · Updated last year
kpu / MEMT
System Combination
☆16 · Aug 28, 2015 · Updated 10 years ago
akikoe / nmtrnng
C++ code of "Learning to Parse and Translate Improves Neural Machine Translation"
☆21 · May 8, 2017 · Updated 9 years ago
MicrosoftTranslator / ToShipOrNotToShip
☆19 · Dec 16, 2024 · Updated last year
Dauphine203 / cpp_dauphine
C++ courses for Paris Dauphine
☆22 · Nov 23, 2021 · Updated 4 years ago
devaansh100 / CLIPTrans
Official implementation for the paper "Transferring Visual Knowledge with Pre-trained Models for Multimodal Machine Translation", publish…
☆20 · Jun 3, 2024 · Updated 2 years ago
esalesky / visrep
This repository contains an extension of fairseq for pixel / visual representations of text for machine translation.
☆37 · Feb 2, 2024 · Updated 2 years ago
ondrejklejch / MT-ComparEval
Tool for comparison and evaluation of machine translation.
☆56 · May 17, 2022 · Updated 4 years ago
machine-intelligence-laboratory / OptimalNumberOfTopics
A set of methods for finding an appropriate number of topics in a text collection
☆15 · Apr 13, 2026 · Updated 3 months ago
mliarakos / lagom-scalajs-example
Example Lagom.js application
☆10 · Jul 10, 2021 · Updated 5 years ago
project-mandolin / mandolin
Large-scale Machine Learning using Apache Spark
☆14 · May 6, 2019 · Updated 7 years ago
deep-spin / sparse-communication
☆12 · Mar 7, 2022 · Updated 4 years ago
dansoutner / LSTMLM
Simple LSTM language modelling toolkit
☆10 · Oct 21, 2022 · Updated 3 years ago
Unbabel / smaug
Python package to augment multilingual data
☆15 · Feb 15, 2023 · Updated 3 years ago
PeterisP / LVTagger
☆18 · Feb 12, 2026 · Updated 5 months ago
mingruimingrui / fast-mosestokenizer
c++ mosestokenizer
☆18 · Mar 13, 2024 · Updated 2 years ago
shamilcm / pedra
Post-editing Datasets by Rakuten (PEDRa)
☆14 · Jun 23, 2021 · Updated 5 years ago
thammegowda / mtdata
A tool that locates, downloads, and extracts machine translation corpora
☆167 · Apr 13, 2026 · Updated 3 months ago
mliarakos / lagom-js
Scala.js client for Lagom
☆12 · Jan 9, 2022 · Updated 4 years ago
thammegowda / tika-ner-corenlp
Stanford CoreNLP NER addon for Apache Tika's NamedEntityParser
☆13 · Feb 26, 2022 · Updated 4 years ago
apartresearch / specificityplus
👩‍💻 Code for the ACL paper "Detecting Edit Failures in LLMs: An Improved Specificity Benchmark"
☆20 · Jan 19, 2024 · Updated 2 years ago
karthikncode / MorphoChain
A model for unsupervised morphological analysis that integrates orthographic and semantic views of words.
☆13 · Oct 10, 2023 · Updated 2 years ago
wroberts / nltk_tgrep
tgrep2 Searching for NLTK Trees
☆15 · Oct 28, 2016 · Updated 9 years ago
jasonmayes / Retraining-TensorFlow-Classifier-Using-Video
Script to convert all MP4 videos in a zip archive to JPG frames at a desired FPS with unique names. It will then retrain the top layers o…
☆12 · Jul 6, 2016 · Updated 10 years ago
kadarakos / hieratt
Experimenting with Hierarchical Attention Networks from https://arxiv.org/abs/1606.02393 in Keras
☆13 · Oct 12, 2016 · Updated 9 years ago
azpoliak / eco
Code and data related to "Efficient, Compositional, Order-Sensitive n-gram Embeddings" (EACL 2017)
☆15 · Apr 6, 2017 · Updated 9 years ago
hainan-xv / zipporah
☆42 · Jul 17, 2018 · Updated 8 years ago
LUMII-AILab / FullStack
Full Stack of Latvian Language Resources for Natural Language Understanding (NLU) and Generation (NLG)
☆16 · Oct 20, 2022 · Updated 3 years ago
maharshi95 / submititnow
A toolkit to create, launch and monitor SLURM jobs over existing python scripts.
☆12 · May 13, 2024 · Updated 2 years ago
bryant / punkt
Unsupervised multilingual sentence segmentation.
☆21 · Feb 26, 2021 · Updated 5 years ago
hplt-project / OpusCleaner
OpusCleaner is a web interface that helps you select, clean and schedule your data for training machine translation models.
☆58 · Feb 3, 2026 · Updated 5 months ago
neubig / lader
A reordering tool for machine translation.
☆15 · May 3, 2019 · Updated 7 years ago
Aleph-Alpha-Research / trigrams
☆60 · Nov 18, 2025 · Updated 8 months ago
mt-upc / SHAS
SHAS: Approaching optimal Segmentation for End-to-End Speech Translation
☆44 · Feb 9, 2023 · Updated 3 years ago
crockpotveggies / dl4j-examples
Deeplearning4j Examples (DL4J, DL4J Spark, DataVec)
☆10 · Aug 16, 2018 · Updated 7 years ago
hltcoe / patapsco
Cross language information retrieval pipeline
☆19 · Jan 12, 2026 · Updated 6 months ago
huggingface / bert-syntax
Assessing syntactic abilities of BERT
☆40 · Jul 18, 2019 · Updated 7 years ago
Helsinki-NLP / mammoth
MAMMOTH: MAssively Multilingual Modular Open Translation @ Helsinki
☆32 · Jul 21, 2026 · Updated last week
amazon-science / contrastive-controlled-mt
Code and data for the IWSLT 2022 shared task on Formality Control for SLT
☆22 · May 24, 2023 · Updated 3 years ago