qhungngo / EVBCorpus
The English-Vietnamese Bilingual Corpus (EVBCorpus) is a collection of English and Vietnamese parallel translations and bitexts.
☆42 Updated 5 years ago
Alternatives and similar repositories for EVBCorpus:
Users who are interested in EVBCorpus are comparing it to the repositories listed below.
- Neural Machine Translation system for English to Vietnamese (IWSLT'15 English-Vietnamese data) ☆60 Updated 5 years ago
- VnDT: A Vietnamese Dependency Treebank ☆21 Updated 3 years ago
- ☆42 Updated 6 years ago
- Unsupervised parallel sentence extraction from comparable corpora ☆16 Updated 5 years ago
- A Supervised Word Alignment Method based on Cross-Language Span Prediction using Multilingual BERT ☆27 Updated 4 years ago
- Scripts to preprocess training and test data and to run fast_align and giza ☆108 Updated 3 years ago
- Supplementary material for "When and Why Are Pre-trained Word Embeddings Useful for Neural Machine Translation?" at NAACL 2018 ☆122 Updated 4 years ago
- Efficient Low-Memory Aligner ☆143 Updated 3 months ago
- OpusFilter - Parallel corpus processing toolkit ☆104 Updated last month
- We release a dataset based on Wikipedia sentences and the corresponding translations in 6 different languages along with the scores (scal… ☆81 Updated 3 years ago
- NTREX -- News Test References for MT Evaluation ☆83 Updated 10 months ago
- Framework for neural-based Quality Estimation ☆42 Updated 4 years ago
- ☆21 Updated 2 years ago
- BERT-based joint intent detection and slot filling with intent-slot attention mechanism (INTERSPEECH 2021) ☆87 Updated 9 months ago
- ☆29 Updated 4 years ago
- Lexically Constrained Neural Machine Translation with Levenshtein Transformer ☆39 Updated 4 years ago
- This repository is used to publish our codes for the conference paper "Vietnamese punctuation prediction using deep neural networks" at S… ☆10 Updated 4 years ago
- Repository of "An Empirical Study of Incorporating Pseudo Data into Grammatical Error Correction" (EMNLP-IJCNLP 2019) ☆68 Updated 5 years ago
- cLang-8 is a dataset for grammatical error correction. ☆104 Updated 2 years ago
- RIBES is an automatic evaluation metric for machine translation. ☆11 Updated 7 years ago
- PhoMT: A High-Quality and Large-Scale Benchmark Dataset for Vietnamese-English Machine Translation (EMNLP 2021) ☆42 Updated 9 months ago
- Vietnamese Treebank ☆26 Updated 5 years ago
- Tools for filtering and cleaning parallel and monolingual corpora for machine translation and other natural language processing tasks. ☆41 Updated last year
- TUFS Asian Language Parallel Corpus ☆50 Updated last year
- Code for AAAI 2021 paper "Lexically Constrained Neural Machine Translation with Explicit Alignment Guidance" ☆25 Updated 2 years ago
- Multilingual Quality Estimation and Automatic Post-editing Dataset ☆41 Updated 3 years ago
- Code for our paper "Mask-Align: Self-Supervised Neural Word Alignment" in ACL 2021 ☆60 Updated 3 years ago
- Improved version of GECToR ☆60 Updated last year
- ☆22 Updated 4 years ago
- ☆36 Updated 2 years ago