levyfan / sentencepiece-jni
Java JNI wrapper for SentencePiece: unsupervised text tokenizer for Neural Network-based text generation.
☆37Updated 2 years ago
Alternatives and similar repositories for sentencepiece-jni
Users who are interested in sentencepiece-jni are comparing it to the libraries listed below.
- Subword Language Model for Query Auto-Completion☆67Updated 5 years ago
- Corpus preprocessing☆97Updated last year
- Word Piece Model python light version with functions tokenize/save/load☆64Updated 4 years ago
- Java port of c++ version of facebook fasttext☆14Updated 5 years ago
- ☆42Updated 6 years ago
- reference pytorch code for intent classification☆44Updated 8 months ago
- Automatic extraction of edited sentences from text edition histories.☆83Updated 3 years ago
- Supplementary material for "When and Why Are Pre-trained Word Embeddings Useful for Neural Machine Translation?" at NAACL 2018☆124Updated 5 years ago
- Neural network models for joint POS tagging and dependency parsing (CoNLL 2017-2018)☆157Updated 6 years ago
- Neural machine translation soft alignment visualisations for web and command line☆72Updated 3 years ago
- This repo includes extensions to the Stanford Dialogue Corpus. It contains crowd-sourced rewrites to facilitate research in dialogue stat…☆90Updated 5 years ago
- Assessing syntactic abilities of BERT☆39Updated 5 years ago
- We release a dataset based on Wikipedia sentences and the corresponding translations in 6 different languages along with the scores (scal…☆81Updated 3 years ago
- [EMNLP 2021] LM-Critic: Language Models for Unsupervised Grammatical Error Correction☆119Updated 3 years ago
- Dialog State Tracking Challenge 6 (DSTC6)☆54Updated 7 years ago
- LM Pretraining with PyTorch/TPU☆134Updated 5 years ago
- Triangular-chain CRF☆25Updated 9 years ago
- Implementation of pQRNN in PyTorch☆46Updated 3 years ago
- Phrase-Indexed Question Answering (PIQA)☆94Updated 6 years ago
- MaxMatch (M^2) Scorer - Evaluation program for grammatical error correction systems.☆152Updated 2 years ago
- A word alignment tool based on famous GIZA++, extended to support multi-threading, resume training and incremental training.☆164Updated 4 years ago
- PyTorch Language Model for 1-Billion Word (LM1B / GBW) Dataset☆123Updated 5 years ago
- TOWARDS AN AUTOMATIC TURING TEST: LEARNING TO EVALUATE DIALOGUE RESPONSES☆29Updated 7 years ago
- Language modeling scripts based on TensorFlow☆58Updated 5 years ago
- A collection of resources on using BERT (https://arxiv.org/abs/1810.04805 ) and related Language Models in production environments.☆96Updated 4 years ago
- Implementation of unsupervised smoothed inverse frequency (Best Paper, Repl4NLP @ ACL 2018)☆77Updated 6 years ago
- A Fast ELMo Implementation. (NOT MAINTAINED ANYMORE)☆38Updated 2 years ago
- Symphony Machine Translation☆38Updated 5 years ago
- Decoding platform for machine translation research☆55Updated 5 years ago
- A Corpus for Multilingual Document Classification in Eight Languages.☆151Updated 3 years ago