Esukhia / Corpora
repo for Tibetan corpora
☆21Updated last year
Related projects
Alternatives and complementary repositories for Corpora
- ✒️ དག་བྱེད། Dakje, improving your spelling and readability☆11Updated 2 years ago
- ☆18Updated 7 years ago
- 🦜 NLP for Tibetan, in Python.☆32Updated last year
- 🏷 བོད་ཏོག [pʰøtɔk̚] Tibetan word tokenizer in Python☆58Updated 2 months ago
- 😎 Curated list of Tibetan NLP projects☆33Updated 4 years ago
- Linguistically analyzed Classical Tibetan texts☆24Updated 3 years ago
- Workshop on Language Technologies for Historical and Ancient Language… (https://circse.github.io/LT4HALA/)☆32Updated 5 months ago
- Dataset for TALLIP2019 paper "Ancient-Modern Chinese Translation with a New Large Training Dataset"☆22Updated 2 years ago
- ☆42Updated 6 years ago
- Chinese sentence segmentation and punctuation restoration, implemented in PyTorch 1.0.☆55Updated 5 years ago
- TIP-LAS: An open source toolkit for Tibetan word segmentation and part-of-speech tagging☆80Updated last year
- An open-access corpus of conversational bilingual speech in Cantonese and English☆40Updated 2 years ago
- Code for the paper "Multi-Task Learning for Domain-General Spoken Disfluency Detection in Dialogue Systems" (Igor Shalyminov, Arash Eshgh…☆24Updated last year
- We use phonetics as a feature to create a joint semantic-phonetic embedding and improve the neural machine translation between Chinese an…☆11Updated 3 years ago
- Repository of "An Empirical Study of Incorporating Pseudo Data into Grammatical Error Correction" (EMNLP-IJCNLP 2019)☆68Updated 4 years ago
- An open-source classical Chinese information processing toolkit developed by Tsinghua Natural Language Processing Group☆47Updated 5 years ago
- Code and data of the paper "MCTS: A Multi-Reference Chinese Text Simplification Dataset".☆28Updated 5 months ago
- Efficient Low-Memory Aligner☆137Updated 2 months ago
- A PyTorch implementation of "Reaching Human-level Performance in Automatic Grammatical Error Correction: An Empirical Study"☆50Updated 5 years ago
- A grammatical error correction reading list maintained by Beijing Language and Culture University Natural Language Processing Group☆25Updated 3 years ago
- ODSQA: Open-Domain Spoken Question Answering Dataset☆58Updated 2 years ago
- Punctuation restoration in ASR text☆33Updated 5 years ago
- Code for our EMNLP-19 paper, CM-Net: A Novel Collaborative Memory Network for Spoken Language Understanding☆28Updated 5 years ago
- Use BERT to predict punctuation on IWSLT2012 and The People's Daily 2014☆65Updated 4 years ago
- Code of zlyang's master dissertation for Chinese grammatical error correction.☆34Updated 5 years ago
- Uyghur text resources crawled from websites☆12Updated 8 years ago
- Learning ASR-Robust Contextualized Embeddings for Spoken Language Understanding☆24Updated last year
- TVsub: DCU-Tencent Chinese-English Dialogue Corpus☆45Updated 6 years ago
- [ACL'21] Data for "An In-depth Study on Internal Structure of Chinese Words".☆14Updated 3 years ago