kakaobrain / word2word
Easy-to-use word-to-word translations for 3,564 language pairs.
☆362 Updated 4 years ago
Alternatives and similar repositories for word2word:
Users who are interested in word2word are comparing it to the libraries listed below.
- A tool for holistic analysis of language generation systems ☆468 Updated 3 years ago
- Evaluating Cross-lingual Sentence Representations ☆450 Updated 3 years ago
- Unsupervised Statistical Machine Translation ☆229 Updated 4 years ago
- This dataset contains 108,463 human-labeled and 656k noisily labeled pairs that feature the importance of modeling structure, context, an… ☆557 Updated 3 years ago
- Fast BPE ☆668 Updated 9 months ago
- Bitextor generates translation memories from multilingual websites ☆292 Updated 4 months ago
- Obtain Word Alignments using Pretrained Language Models (e.g., mBERT) ☆361 Updated last year
- TED parallel Corpora is a growing collection of Bilingual parallel corpora, Multilingual parallel corpora and Monolingual corpora extracted… ☆248 Updated 9 years ago
- Unsupervised Neural Machine Translation ☆474 Updated 4 years ago
- Preprocessing Library for Natural Language Processing ☆161 Updated 2 years ago
- ☆321 Updated 2 years ago
- GitHub Typo Corpus: A Large-Scale Multilingual Dataset of Misspellings and Grammatical Errors ☆505 Updated 5 years ago
- Pre-trained subword embeddings in 275 languages, based on Byte-Pair Encoding (BPE) ☆1,207 Updated 6 months ago
- LASER multilingual sentence embeddings as a pip package ☆224 Updated last year
- A Corpus for Multilingual Document Classification in Eight Languages. ☆151 Updated 2 years ago
- A TensorFlow implementation of Neural Sequence Labeling model, which is able to tackle sequence labeling tasks such as POS Tagging, Chunk… ☆234 Updated 6 years ago
- Simple, fast unsupervised word aligner ☆750 Updated 2 years ago
- This repository contains various ways to calculate sentence vector similarity using NLP models ☆199 Updated 4 years ago
- Bicleaner is a parallel corpus classifier/cleaner that aims at detecting noisy sentence pairs in a parallel corpus. ☆156 Updated 9 months ago
- 💥 Use the latest Stanza (StanfordNLP) research models directly in spaCy ☆733 Updated 7 months ago
- A single model that parses Universal Dependencies across 75 languages. Given a sentence, jointly predicts part-of-speech tags, morphology… ☆222 Updated 2 years ago
- A sentence segmenter that actually works! ☆305 Updated 4 years ago
- Code for the AllenNLP demo. ☆197 Updated 2 years ago
- Python port of Moses tokenizer, truecaser and normalizer ☆492 Updated 10 months ago
- ☆362 Updated 2 years ago
- An Analysis Toolkit for Natural Language Generation (Translation, Captioning, Summarization, etc.) ☆443 Updated 3 weeks ago
- Annotated dataset of 100 works of fiction to support tasks in natural language processing and the computational humanities. ☆352 Updated 2 years ago
- A framework to learn cross-lingual word embedding mappings ☆647 Updated last year
- Fast + Non-Autoregressive Grammatical Error Correction using BERT. Code and Pre-trained models for paper "Parallel Iterative Edit Models … ☆231 Updated 2 years ago
- A list of Neural MT implementations ☆362 Updated 2 years ago