levyfan / sentencepiece-jni
Java JNI wrapper for SentencePiece: unsupervised text tokenizer for Neural Network-based text generation.
☆37 Updated 2 years ago
Alternatives and similar repositories for sentencepiece-jni
Users who are interested in sentencepiece-jni are comparing it to the libraries listed below.
- Fork of huggingface/pytorch-pretrained-BERT for BERT on STILTs ☆106 Updated 2 years ago
- Supplementary material for "When and Why Are Pre-trained Word Embeddings Useful for Neural Machine Translation?" at NAACL 2018 ☆124 Updated 5 years ago
- Language-agnostic BERT Sentence Embedding (LaBSE) ☆153 Updated 5 years ago
- Corpus preprocessing ☆99 Updated last year
- LM Pretraining with PyTorch/TPU ☆136 Updated 5 years ago
- Neural Text Generation with Unlikelihood Training ☆309 Updated 4 years ago
- Dockerized NMT frameworks for nmt-wizard ☆39 Updated 2 years ago
- Neural network models for joint POS tagging and dependency parsing (CoNLL 2017-2018) ☆156 Updated 6 years ago
- ☆324 Updated 2 years ago
- Decoding platform for machine translation research ☆55 Updated 6 years ago
- ☆94 Updated last year
- One million English sentences, each split into two sentences that together preserve the original meaning, extracted from Wikipedia edits. ☆122 Updated 6 years ago
- Code for "Unsupervised Multilingual Word Embedding with Limited Resources using Neural Language Models" and "Learning Contextualised Cros… ☆31 Updated 2 years ago
- Dataset for the Emerging & Novel Entity NER task (WNUT '17) ☆111 Updated 3 years ago
- A word alignment tool based on the famous GIZA++, extended to support multi-threading, resuming training, and incremental training. ☆165 Updated 4 years ago
- PyTorch Language Model for 1-Billion Word (LM1B / GBW) Dataset ☆123 Updated 6 years ago
- Byte Pair Encoding for Python! ☆231 Updated 3 years ago
- MPNet: Masked and Permuted Pre-training for Language Understanding https://arxiv.org/pdf/2004.09297.pdf ☆292 Updated 4 years ago
- This repository contains code to replicate the no-longer publicly available Toronto BookCorpus dataset ☆49 Updated 3 years ago
- Real-Time Open-Domain Question Answering with Dense-Sparse Phrase Index (DenSPI) ☆201 Updated 2 years ago
- Multiple Different Natural Language Processing Tasks in a Single Deep Model ☆48 Updated 6 years ago
- Neural models and instructions on how to reproduce our results for our neural grammatical error correction systems from M. Junczys-Dowmun… ☆88 Updated 6 years ago
- Implementation of a linear-chain CRF in PyTorch ☆97 Updated 4 years ago
- "End-to-End Abstractive Summarization for Meetings" paper - Unofficial PyTorch Implementation ☆53 Updated 2 years ago
- Flexible classic and NeurAl Retrieval Toolkit ☆220 Updated 2 months ago
- A novel method of constrained decoding for neural NLG (NNLG) models ☆84 Updated 5 years ago
- A Corpus for Multilingual Document Classification in Eight Languages. ☆151 Updated 3 years ago
- A Fast ELMo Implementation. (NOT MAINTAINED ANYMORE) ☆38 Updated 2 years ago
- eXtensible Neural Machine Translation ☆187 Updated 5 years ago
- ☆219 Updated 5 years ago