Socret360 / joint-khmer-word-segmentation-and-pos-tagging
A Keras implementation of a deep learning network that performs Word Segmentation and Part-of-Speech (POS) Tagging simultaneously, introduced by Buoy et al. in the paper "Joint Khmer Word Segmentation and Part-of-Speech Tagging Using Deep Learning".
☆11 · Updated 3 years ago
Alternatives and similar repositories for joint-khmer-word-segmentation-and-pos-tagging
Users who are interested in joint-khmer-word-segmentation-and-pos-tagging are comparing it to the repositories listed below.
- Khmer language processing toolkit ☆72 · Updated last year
- Word segmentation using Conditional Random Fields (CRF) for Khmer documents ☆29 · Updated 4 years ago
- A large collection of Khmer language resources. Khmer is the official language of Cambodia. ☆114 · Updated last week
- khPOS (Khmer Part-of-Speech) Corpus for Khmer NLP Research and Development ☆26 · Updated last year
- Khmer Unicode text data for unsupervised learning of language models ☆21 · Updated 4 years ago
- Machine Reading Comprehension specifically for the Vietnamese language ☆40 · Updated 3 years ago
- A Robustly Optimized BERT Pretraining Approach for Vietnamese ☆32 · Updated 9 months ago
- The English-Vietnamese Bilingual Corpus (EVBCorpus) is a collection of English and Vietnamese parallel translations and bitexts. ☆42 · Updated 5 years ago
- PhoMT: A High-Quality and Large-Scale Benchmark Dataset for Vietnamese-English Machine Translation (EMNLP 2021) ☆43 · Updated 9 months ago
- TUFS Asian Language Parallel Corpus ☆50 · Updated 2 years ago
- BERT-based joint intent detection and slot filling with intent-slot attention mechanism (INTERSPEECH 2021) ☆87 · Updated 9 months ago
- PhoNLP: A BERT-based multi-task learning model for part-of-speech tagging, named entity recognition and dependency parsing (NAACL 2021) ☆142 · Updated 4 months ago
- The FLORES+ Machine Translation Benchmark ☆102 · Updated 6 months ago
- A dataset for Vietnamese Spelling Correction ☆15 · Updated 3 years ago
- Wiktra - Python tool of Wiktionary Transliteration modules for 514 languages and their 102 different scripts (orthographies) ☆30 · Updated 3 years ago
- VNHSGE: Vietnamese High School Graduation Examination Dataset for Large Language Models ☆25 · Updated last year
- BARTpho: Pre-trained Sequence-to-Sequence Models for Vietnamese (INTERSPEECH 2022) ☆103 · Updated 9 months ago
- cLang-8 is a dataset for grammatical error correction. ☆104 · Updated 2 years ago
- Khmer wordlist for line and word breaking ☆36 · Updated 3 years ago
- CVPR 2022: Table Structure Recognition ☆39 · Updated 3 years ago
- Automatic Post-Editing for Vietnamese ☆12 · Updated 3 years ago
- ViSen is a library for formatting the tone marks of Vietnamese sentences ☆20 · Updated 3 years ago
- A High-Quality and Large-Scale Dataset for English-Vietnamese Speech Translation (INTERSPEECH 2022) ☆21 · Updated 9 months ago
- Building a 500GB dataset (20% done) of Vietnamese text for training large language models ☆26 · Updated 2 years ago