wisesight / newmm-tokenizer
Standalone dictionary-based tokenizer combining Maximum Matching with Thai Character Cluster (newmm), extracted from PyThaiNLP
☆13 · Updated 2 years ago
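To illustrate what "dictionary-based maximum matching" means, here is a minimal sketch of greedy longest-match segmentation. It is not the code from this repository or from PyThaiNLP: the function name `longest_match_tokenize` and the three-word `TOY_DICTIONARY` are hypothetical, and the Thai Character Cluster step that newmm adds is left out entirely.

```python
def longest_match_tokenize(text, dictionary):
    """Greedy left-to-right longest-match segmentation over a word set.

    Simplified illustration only: the real newmm tokenizer also takes
    Thai Character Cluster boundaries into account, which this omits.
    """
    max_len = max((len(w) for w in dictionary), default=1)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking until a dictionary hit.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if candidate in dictionary:
                break
        else:
            # No dictionary word starts here: emit a single character.
            candidate = text[i]
        tokens.append(candidate)
        i += len(candidate)
    return tokens


# Hypothetical toy dictionary; a real tokenizer loads a full word list.
TOY_DICTIONARY = {"ฉัน", "กิน", "ข้าว"}
tokens = longest_match_tokenize("ฉันกินข้าว", TOY_DICTIONARY)
print(tokens)
```

Thai is written without spaces between words, so the dictionary is what decides where one word ends and the next begins. Characters that start no dictionary word fall back to single-character tokens here.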
Related projects:
- A Dataset for Thai Text Summarization with over 310K articles. ☆28 · Updated last year
- ☆11 · Updated 5 years ago
- CRF syllable segmenter for Thai ☆26 · Updated 4 months ago
- ☆14 · Updated 4 years ago
- Thai Named Entity Recognition ☆53 · Updated last year
- NLP for Thai ☆25 · Updated 4 months ago
- ☆36 · Updated 3 years ago
- Thai Word Segmentation and Part-of-Speech Tagging with Deep Learning ☆41 · Updated 7 years ago
- ☆11 · Updated last year
- ☆14 · Updated 5 years ago
- Thai News Dataset from Thai government website. ☆12 · Updated this week
- Word Similarity Datasets for Thai Language ☆18 · Updated 4 years ago
- PyTorch implementation of the paper "Thai Nested Named Entity Recognition" ☆39 · Updated 8 months ago
- Thai Law Dataset (Act of Parliament) ☆17 · Updated 3 years ago
- Scrape, clean and explore ThaiME dataset ☆12 · Updated 4 years ago
- Scripts for cleaning and creating train/validation/test splits for Thai Common Voice ☆10 · Updated 3 years ago
- Thai Social Media Sentiment Dataset ☆77 · Updated 3 years ago
- Handling Cross- and Out-of-Domain Samples in Thai Word Segmentation (ACL 2021 Findings). ☆30 · Updated 7 months ago
- Thai PDPA Website (Unofficial) ☆10 · Updated last year
- News Article Corpus from Prachathai.com ☆16 · Updated 3 years ago
- Dataset for fake news detection in the healthcare domain ☆12 · Updated 2 years ago
- Parallel Universal Dependencies. ☆14 · Updated 4 months ago
- ☆9 · Updated last year
- A Fast and Accurate Neural Thai Word Segmenter ☆79 · Updated 4 months ago
- NLP course at Chulalongkorn University 2019 ☆22 · Updated 5 years ago
- ☆15 · Updated 2 years ago
- Collection of Wongnai's datasets ☆75 · Updated 5 years ago
- PyThaiNLP for spaCy ☆13 · Updated last year
- Open Thai Wikipedia QA Dataset made by iApp Technology ☆14 · Updated 3 years ago
- The CMU Link Grammar natural language parser ☆12 · Updated 3 months ago