levyfan / sentencepiece-jni
Java JNI wrapper for SentencePiece: unsupervised text tokenizer for Neural Network-based text generation.
☆38 Updated 2 years ago
Alternatives and similar repositories for sentencepiece-jni
Users who are interested in sentencepiece-jni are comparing it to the libraries listed below.
- This repo includes extensions to the Stanford Dialogue Corpus. It contains crowd-sourced rewrites to facilitate research in dialogue stat… ☆92 Updated 6 years ago
- Byte Pair Encoding for Python! ☆232 Updated 3 years ago
- Repository for the paper "Optimal Subarchitecture Extraction for BERT" ☆470 Updated 3 years ago
- Supplementary material for "When and Why Are Pre-trained Word Embeddings Useful for Neural Machine Translation?" at NAACL 2018 ☆123 Updated last month
- Fork of huggingface/pytorch-pretrained-BERT for BERT on STILTs ☆106 Updated 3 years ago
- Resources for the OpenNMT hackathon ☆51 Updated 6 years ago
- Flexible classic and NeurAl Retrieval Toolkit ☆220 Updated 4 months ago
- Corpus preprocessing ☆99 Updated last year
- LM Pretraining with PyTorch/TPU ☆136 Updated 6 years ago
- Implementation of unsupervised smoothed inverse frequency (Best Paper, Repl4NLP @ ACL 2018) ☆78 Updated 6 years ago
- Facebook's FastText for Java ☆81 Updated 7 years ago
- Neural network models for joint POS tagging and dependency parsing (CoNLL 2017-2018) ☆156 Updated 6 years ago
- Dockerized NMT frameworks for nmt-wizard ☆39 Updated 2 years ago
- Neural Text Generation with Unlikelihood Training ☆309 Updated 4 years ago
- Decoding platform for machine translation research ☆54 Updated 6 years ago
- Implementation of pQRNN in PyTorch ☆46 Updated 4 years ago
- ☆325 Updated 2 years ago
- A Dead Simple BERT API for Python and Java (https://github.com/google-research/bert) ☆176 Updated 2 years ago
- Subword Language Model for Query Auto-Completion ☆67 Updated 6 years ago
- Language-agnostic BERT Sentence Embedding (LaBSE) ☆153 Updated 5 years ago
- This repository contains the FewGLUE dataset for few-shot natural language understanding. ☆160 Updated 5 years ago
- Java port of c++ version of facebook fasttext ☆15 Updated 6 years ago
- A collection of resources on using BERT (https://arxiv.org/abs/1810.04805) and related Language Models in production environments. ☆96 Updated 4 years ago
- ICLR 2018 Quick-Thought vectors ☆204 Updated 6 years ago
- Phrase-Indexed Question Answering (PIQA) ☆94 Updated 6 years ago
- One million English sentences, each split into two sentences that together preserve the original meaning, extracted from Wikipedia edits. ☆123 Updated 6 years ago
- ☆219 Updated 5 years ago
- MPNet: Masked and Permuted Pre-training for Language Understanding https://arxiv.org/pdf/2004.09297.pdf ☆294 Updated 4 years ago
- Code to reproduce experiments in the paper "Task-Oriented Dialogue as Dataflow Synthesis" (TACL 2020). ☆308 Updated last year
- Odinson is a powerful and highly optimized open-source framework for rule-based information extraction. Odinson couples a simple, yet pow… ☆72 Updated last year