levyfan / sentencepiece-jni
Java JNI wrapper for SentencePiece: unsupervised text tokenizer for Neural Network-based text generation.
☆37 · Updated 2 years ago
Alternatives and similar repositories for sentencepiece-jni:
Users who are interested in sentencepiece-jni compare it to the repositories listed below:
- Dockerized NMT frameworks for nmt-wizard ☆39 · Updated 2 years ago
- Supplementary material for "When and Why Are Pre-trained Word Embeddings Useful for Neural Machine Translation?" at NAACL 2018 ☆122 · Updated 4 years ago
- Corpus preprocessing ☆96 · Updated last year
- ☆42 · Updated 6 years ago
- Subword Language Model for Query Auto-Completion ☆67 · Updated 5 years ago
- Neural network models for joint POS tagging and dependency parsing (CoNLL 2017-2018) ☆158 · Updated 5 years ago
- [EMNLP 2021] LM-Critic: Language Models for Unsupervised Grammatical Error Correction ☆119 · Updated 3 years ago
- LM Pretraining with PyTorch/TPU ☆134 · Updated 5 years ago
- Data and code for the paper "End-to-End Slot Alignment and Recognition for Cross-Lingual NLU" (Accepted to EMNLP 2020) ☆24 · Updated 3 years ago
- This repo includes extensions to the Stanford Dialogue Corpus. It contains crowd-sourced rewrites to facilitate research in dialogue stat… ☆89 · Updated 5 years ago
- ☆21 · Updated 5 years ago
- On the Stability of Fine-tuning BERT: Misconceptions, Explanations, and Strong Baselines ☆136 · Updated last year
- We release a dataset based on Wikipedia sentences and the corresponding translations in 6 different languages along with the scores (scal… ☆81 · Updated 3 years ago
- A Benchmark Dataset for Understanding Disfluencies in Question Answering ☆62 · Updated 3 years ago
- Word Piece Model python light version with functions tokenize/save/load ☆66 · Updated 4 years ago
- Decoding platform for machine translation research ☆55 · Updated 5 years ago
- BERT models for many languages created from Wikipedia texts ☆33 · Updated 4 years ago
- Code and datasets of "Multilingual Extractive Reading Comprehension by Runtime Machine Translation" ☆40 · Updated 6 years ago
- This project attempts to maintain the SOTA performance in machine translation ☆108 · Updated 4 years ago
- Team Kakao&Brain's Grammatical Error Correction System for the ACL 2019 BEA Shared Task ☆92 · Updated 5 years ago
- Neural models and instructions on how to reproduce our results for our neural grammatical error correction systems from M. Junczys-Dowmun… ☆89 · Updated 5 years ago
- ☆33 · Updated 8 years ago
- Symphony Machine Translation ☆38 · Updated 4 years ago
- Multiple Different Natural Language Processing Tasks in a Single Deep Model ☆48 · Updated 6 years ago
- This repository contains the code for "BERTRAM: Improved Word Embeddings Have Big Impact on Contextualized Representations". ☆63 · Updated 4 years ago
- An extension of word2vec to learn phrase embeddings ☆75 · Updated 6 years ago
- [EMNLP 2018] Towards Universal Dialogue State Tracking ☆42 · Updated 5 years ago
- Examples, tutorials and use cases for Marian, including our WMT-2017/18 baselines. ☆77 · Updated 2 years ago
- A program to choose transfer languages for cross-lingual learning ☆72 · Updated last year
- Phrase-Indexed Question Answering (PIQA) ☆94 · Updated 5 years ago