mingruimingrui/fast-mosestokenizer

mingruimingrui / fast-mosestokenizer

C++ mosestokenizer

☆18

Alternatives and similar repositories for fast-mosestokenizer

Users that are interested in fast-mosestokenizer are comparing it to the libraries listed below.

microsoft / factored-segmenter
Unsupervised factor-based text tokenizer for natural-language processing applications
☆17 · Jul 24, 2020 · Updated 6 years ago
AppraiseDev / OCELoT
Project OCELoT: an Open, Collaborative Evaluation Leaderboard of Translations
☆23 · Jul 11, 2026 · Updated 2 weeks ago
laboroai / Laboro-ParaCorpus
Scripts for creating a Japanese-English parallel corpus and training NMT models
☆19 · Nov 9, 2021 · Updated 4 years ago
marian-nmt / sotastream
A library for data streaming and augmentation
☆22 · May 5, 2025 · Updated last year
noe / fairseq-tensorboard
Small utility to monitor fairseq training in tensorboard
☆21 · Apr 28, 2019 · Updated 7 years ago
Unbabel / smaug
Python package to augment multilingual data
☆15 · Feb 15, 2023 · Updated 3 years ago
ghrua / NgramRes
☆23 · Nov 6, 2022 · Updated 3 years ago
bryant / punkt
Unsupervised multilingual sentence segmentation.
☆21 · Feb 26, 2021 · Updated 5 years ago
chezou / Mykytea-python
Python wrapper for KyTea
☆36 · Mar 30, 2026 · Updated 3 months ago
modelpredict / language-identification-survey
Live survey of off-the-shelf language identification tools for python
☆27 · Apr 13, 2022 · Updated 4 years ago
aparrish / word-gan-book-generator
Generating books from GANs trained on bitmaps of whole words
☆23 · Nov 30, 2019 · Updated 6 years ago
baoy-nlp / DSS-VAE-pytorch
Generating Sentences from Disentangled Syntactic and Semantic Spaces
☆11 · Jun 24, 2019 · Updated 7 years ago
Shavvimal / fingerprint-browser
☆12 · Nov 17, 2023 · Updated 2 years ago
infernet-h2020 / DCAlign
☆12 · Mar 12, 2026 · Updated 4 months ago
uiwjs / next-remove-imports
The default behavior is to remove all .less/.css/.scss/.sass/.styl imports from all packages in node_modules.
☆17 · Apr 21, 2026 · Updated 3 months ago
ZurichNLP / nmtscore
A library of translation-based text similarity measures
☆25 · Dec 11, 2023 · Updated 2 years ago
wlin12 / SMMTT
Social Media Machine Translation Toolkit
☆21 · Sep 13, 2013 · Updated 12 years ago
bicici / FDA
Feature Decay Algorithms
☆11 · Mar 5, 2014 · Updated 12 years ago
jcmf / glulx-strings
extract raw text fragments from interactive fiction glulx gblorb Inform
☆19 · Apr 29, 2020 · Updated 6 years ago
wmt-conference / wmt-format-tools
Tools for formatting WMT hypothesis and test sets in XML
☆27 · Apr 18, 2025 · Updated last year
browsermt / students
Efficient teacher-student models and scripts to make them
☆57 · Dec 16, 2023 · Updated 2 years ago
shyyhs / CourseraParallelCorpusMining
Coursera Corpus Mining and Multistage Fine-Tuning for Improving Lectures Translation
☆15 · Aug 27, 2024 · Updated last year
thevasudevgupta / transformers-adapters
This repository hosts my experiments for the project I did with OffNote Labs.
☆10 · Apr 12, 2021 · Updated 5 years ago
snover / terp
TER-plus Machine Translation metric.
☆31 · May 23, 2022 · Updated 4 years ago
1a3orn / very-simple-moe
Extremely simple MoE implementation, mostly based off Switch Transformer
☆13 · Feb 26, 2024 · Updated 2 years ago
indiejoseph / tf-ran-cell
Recurrent Additive Networks for Tensorflow
☆16 · Jun 30, 2017 · Updated 9 years ago
thammegowda / mtdata
A tool that locates, downloads, and extracts machine translation corpora
☆167 · Apr 13, 2026 · Updated 3 months ago
ehsanasgari / 1000Langs
Creating super-parallel corpora of more than 1500+ unique languages for NLP research
☆33 · Dec 8, 2022 · Updated 3 years ago
bitextor / bifixer
Tool to fix bitexts and tag near-duplicates for removal
☆35 · Sep 4, 2025 · Updated 10 months ago
BayesForDays / nontology
Matrix tools for building and inspecting latent spaces
☆26 · Aug 19, 2018 · Updated 7 years ago
tjingrant / onnx-tf
Experimental Tensorflow Backend for ONNX
☆11 · Nov 21, 2017 · Updated 8 years ago
FranciscoDA / ps2mcfs
FUSE driver that allows mounting Sony PlayStation 2 memory card files (either from an emulator or obtained from real hardware) into your …
☆22 · May 2, 2023 · Updated 3 years ago
CanCLID / canto-filter
粵文語料篩選器 Cantonese text filter
☆43 · Feb 4, 2026 · Updated 5 months ago
JDonner / TreeKernel
C++ implementation of Alessandro Moschitti's Tree Kernel algorithm, from "Making Tree Kernels Practical for Natural Language Learning"
☆12 · Oct 10, 2019 · Updated 6 years ago
222464 / MiniNeoRL
Simple, small, fully-connected Python version of NeoRL
☆11 · Jan 29, 2016 · Updated 10 years ago
OpenNMT / nmt-wizard
Launch NMT tasks on the cloud
☆13 · May 8, 2023 · Updated 3 years ago
Chillsmeit / jammy-change-gdm-background
Change the GDM background of Ubuntu and Pop_OS! 22.04 Jammy
☆14 · Jun 20, 2023 · Updated 3 years ago
hkwi / sqlalchemy_gevent
sqlalchemy dialect adaptor for gevent to work in non-blocking mode
☆20 · Feb 22, 2018 · Updated 8 years ago
dugu9sword / manytasks
A tool for deploying many tasks automatically.
☆11 · Jan 16, 2025 · Updated last year