qhungngo/EVBCorpus

The English-Vietnamese Bilingual Corpus (EVBCorpus) is a collection of English and Vietnamese parallel translations and bitexts.

☆52

Alternatives and similar repositories for EVBCorpus

Users who are interested in EVBCorpus are comparing it to the repositories listed below.

stefan-it / nmt-en-vi
Neural Machine Translation system for English to Vietnamese (IWSLT'15 English-Vietnamese data)
☆64 · Jul 22, 2019 · Updated 7 years ago
binhvq / vietdict106k
☆13 · Aug 2, 2021 · Updated 4 years ago
hangyav / UnsupPSE
Unsupervised parallel sentence extraction from comparable corpora
☆16 · Aug 6, 2019 · Updated 6 years ago
matbahasa / TALPCo
TUFS Asian Language Parallel Corpus
☆53 · May 1, 2023 · Updated 3 years ago
duyvuleo / Transformer-DyNet
An Implementation of Transformer (Attention Is All You Need) in DyNet
☆64 · Nov 30, 2023 · Updated 2 years ago
tnq177 / witwicky
Witwicky: An implementation of Transformer in PyTorch.
☆22 · Aug 17, 2020 · Updated 5 years ago
bicici / FDA
Feature Decay Algorithms
☆11 · Mar 5, 2014 · Updated 12 years ago
Unbabel / BConTrasT
☆20 · Aug 17, 2021 · Updated 4 years ago
ZurichNLP / domain-robustness
☆13 · Dec 11, 2020 · Updated 5 years ago
cisnlp / parcoure
ParCourE - Parallel Corpus Explorer
☆12 · Dec 27, 2021 · Updated 4 years ago
MaxyLee / 3AM
Official code and data of "3AM: An Ambiguity-Aware Multi-Modal Machine Translation Dataset"
☆12 · Dec 8, 2024 · Updated last year
NTT123 / viwik18
Vietnamese Text Dataset - Wikipedia vi 2018
☆16 · Feb 18, 2019 · Updated 7 years ago
shmulvad / zero-for-ner
Zero-Shot Learning in Named Entity Recognition with Common Sense Knowledge
☆17 · Nov 16, 2021 · Updated 4 years ago
phuonglh / ai.vitk.ner
Vietnamese Named Entity Recognition
☆31 · Oct 12, 2020 · Updated 5 years ago
VietnamAIHub / GPTViet
This project aims to develop a bilingual foundation model with both language and multimodal capabilities. The objective is to enhance an …
☆16 · Dec 4, 2025 · Updated 7 months ago
marliesvanderwees / dds-nmt
Dynamic data selection for neural machine translation
☆20 · Jan 28, 2018 · Updated 8 years ago
neulab / word-embeddings-for-nmt
Supplementary material for "When and Why Are Pre-trained Word Embeddings Useful for Neural Machine Translation?" at NAACL 2018
☆123 · Sep 22, 2025 · Updated 10 months ago
undertheseanlp / corpus.viwiki
Vietnamese Wikipedia Corpus
☆20 · May 18, 2017 · Updated 9 years ago
mettamind-ai / physics_of_llms
Experiments related to LLMs for Vietnamese (inspired by the Physics of LLMs series)
☆11 · Oct 21, 2024 · Updated last year
raymondhs / constrained-levt
Lexically Constrained Neural Machine Translation with Levenshtein Transformer
☆40 · Jul 14, 2020 · Updated 6 years ago
duyvuleo / VNTC
A Large-scale Vietnamese News Text Classification Corpus
☆109 · Sep 24, 2019 · Updated 6 years ago
rwsproat / text-normalization-data
Links to data used in Sproat & Jaitly (https://arxiv.org/abs/1611.00068) experiments.
☆77 · Jul 9, 2021 · Updated 5 years ago
ImperialNLP / MMT-Delib
☆10 · Dec 21, 2022 · Updated 3 years ago
XuezheMax / NeuroNLP
Deep neural models for core NLP tasks
☆13 · Nov 9, 2017 · Updated 8 years ago
heraclex12 / Viwiki-spelling
A dataset for Vietnamese Spelling Correction
☆17 · Sep 27, 2021 · Updated 4 years ago
jerrygaoLondon / jgtextrank
jgtextrank: Yet another Python implementation of TextRank
☆13 · Nov 27, 2019 · Updated 6 years ago
Deep1994 / An-Attentive-Neural-Model-for-labeling-Adverse-Drug-Reactions
An Attentive Neural Sequence Labeling Model for Adverse Drug Reactions Mentions Extraction
☆15 · Jan 17, 2020 · Updated 6 years ago
mahfuzibnalam / terminology_evaluation
☆21 · May 30, 2022 · Updated 4 years ago
amittai / cynical
Cynical data selection
☆20 · Jan 16, 2021 · Updated 5 years ago
jackbandy / bookcorpus-datasheet
Documentation effort for the BookCorpus dataset
☆34 · Jun 2, 2021 · Updated 5 years ago
currentslab / fastlangid
fastlangid, the only language identification package that support cantonese (zh-yue), simplified (zh-hans) and traditional chinese (zh-ha…
☆43 · Dec 6, 2022 · Updated 3 years ago
EdinburghNLP / wmt17-scripts
☆20 · Jun 14, 2019 · Updated 7 years ago
SkyAndCloud / awesome-transformer
This repo is not maintained. For latest version, please visit https://github.com/ictnlp. A collection of transformer's guides, implementa…
☆44 · Dec 5, 2018 · Updated 7 years ago
Mrbuchixiangcai / Linux-kernel_0.11
Annotated Linux kernel source code (Zhao Jiong's version)
☆10 · Oct 18, 2018 · Updated 7 years ago
VinAIResearch / PhoMT
PhoMT: A High-Quality and Large-Scale Benchmark Dataset for Vietnamese-English Machine Translation (EMNLP 2021)
☆52 · Jun 3, 2025 · Updated last year
teslacool / SCA
Soft Contextual Data Augmentation
☆39 · Jul 25, 2024 · Updated 2 years ago
longyuewangdcu / tvsub
TVsub: DCU-Tencent Chinese-English Dialogue Corpus
☆47 · Feb 14, 2018 · Updated 8 years ago
kelp404 / mongoose-profiler
A performance tuning tool for Mongoose. Show explain results when the query is slow.
☆12 · Jun 17, 2023 · Updated 3 years ago
Helsinki-NLP / OpusFilter
OpusFilter - Parallel corpus processing toolkit
☆115 · Jul 1, 2026 · Updated 3 weeks ago