soaxelbrooke / python-bpe
Byte Pair Encoding for Python!
☆232 · Updated 3 years ago
Alternatives and similar repositories for python-bpe
Users who are interested in python-bpe are comparing it to the libraries listed below.
- ☆324 · Updated 2 years ago
- Unsupervised Statistical Machine Translation ☆229 · Updated 5 years ago
- Neural Text Generation with Unlikelihood Training ☆310 · Updated 4 years ago
- Fast BPE ☆678 · Updated last year
- Neat (Neural Attention) Vision, is a visualization tool for the attention mechanisms of deep-learning models for Natural Language Process… ☆249 · Updated 7 years ago
- Supplementary material for "When and Why Are Pre-trained Word Embeddings Useful for Neural Machine Translation?" at NAACL 2018 ☆123 · Updated last month
- A tool for holistic analysis of language generation systems ☆471 · Updated last month
- Full Python implementation of the ROUGE metric, producing the same results as the official Perl implementation. ☆160 · Updated 6 years ago
- A Python wrapper for the ROUGE summarization evaluation package ☆249 · Updated 4 years ago
- New dataset ☆308 · Updated 4 years ago
- Code to reproduce the experiments from the paper. ☆102 · Updated 2 years ago
- Large corpus of uncompressed and compressed sentences from news articles. ☆125 · Updated 8 years ago
- This is a repository with the data and code for the ACL 2019 paper "When a Good Translation is Wrong in Context: ..." and the EMNLP 2019 … ☆98 · Updated 5 years ago
- Implementation of a linear-chain CRF in PyTorch ☆98 · Updated 4 years ago
- Python port of Moses tokenizer, truecaser and normalizer ☆495 · Updated last year
- LASER multilingual sentence embeddings as a pip package ☆225 · Updated 2 years ago
- Builds a wordpiece (subword) vocabulary compatible with Google Research's BERT ☆231 · Updated 4 years ago
- Open-Source Machine Translation Quality Estimation in PyTorch ☆231 · Updated 3 years ago
- Python code for various NLP metrics ☆169 · Updated 6 years ago
- Unsupervised Question answering via Cloze Translation ☆219 · Updated 3 years ago
- Python library & examples for Masked Language Model Scoring (ACL 2020) ☆348 · Updated 2 years ago
- This dataset contains 108,463 human-labeled and 656k noisily labeled pairs that feature the importance of modeling structure, context, an… ☆560 · Updated 3 years ago
- eXtensible Neural Machine Translation ☆185 · Updated last month
- 📃 Language-model-based sentence scoring library ☆308 · Updated 3 years ago
- Easily fine-tune GPT-2 to fill in missing text ☆201 · Updated 2 years ago
- Implementation of NeurIPS 19 paper: Paraphrase Generation with Latent Bag of Words ☆122 · Updated 4 years ago
- A Corpus for Multilingual Document Classification in Eight Languages. ☆152 · Updated 3 years ago
- ICLR 2018 Quick-Thought vectors ☆204 · Updated 6 years ago
- One million English sentences, each split into two sentences that together preserve the original meaning, extracted from Wikipedia edits. ☆123 · Updated 6 years ago
- Easy to use NLP library built on PyTorch and TorchText ☆256 · Updated 5 years ago