Standalone Dictionary-based, Maximum Matching + Thai Character Cluster (newmm) tokenizer extracted from PyThaiNLP
☆13Jan 6, 2022Updated 4 years ago
Alternatives and similar repositories for newmm-tokenizer
Users who are interested in newmm-tokenizer are comparing it to the libraries listed below.
- ☆14Jun 22, 2020Updated 5 years ago
- Scripts for cleaning and creating train/validation/test splits for Thai Common Voice☆12Sep 2, 2021Updated 4 years ago
- Shan Natural Language Processing tools inspired by PyThaiNLP☆14Updated this week
- Dataset for fake news detection in healthcare domain☆12Apr 30, 2022Updated 3 years ago
- Scrape, clean and explore ThaiME dataset☆12Jul 29, 2020Updated 5 years ago
- Make Pad Thai From few-shot learning 😉☆12Jan 19, 2023Updated 3 years ago
- A Dataset for Thai Text Summarization with over 310K articles.☆29Feb 4, 2023Updated 3 years ago
- Parallel Universal Dependencies.☆15Nov 12, 2025Updated 3 months ago
- ☆40Feb 1, 2023Updated 3 years ago
- ☆17May 6, 2022Updated 3 years ago
- Thai smart home corpus with "Gowajee" hotword☆18Jul 30, 2023Updated 2 years ago
- ☆44Mar 26, 2021Updated 4 years ago
- Java library to tokenize Thai text into a list of TCCs☆19May 30, 2017Updated 8 years ago
- Thai Word Segmentation and Part-of-Speech Tagging with Deep Learning☆40May 26, 2017Updated 8 years ago
- Thai Named Entity Recognition with BiLSTM-CRF using Word/Character Embedding☆17Oct 27, 2019Updated 6 years ago
- Finetune wav2vec2-large-xlsr-53 with Thai Common Voice Corpus 7.0☆51Apr 23, 2022Updated 3 years ago
- ICML 2019. Turn a pre-trained GAN model into a content-addressable model without retraining.☆21Jul 25, 2024Updated last year
- English-Thai Machine Translation Models☆29May 3, 2024Updated last year
- Tesseract OCR tools, trained and fine-tuned, for reading Thai national documents set in the TH Sarabun National font. Read README.md to see about my …☆29Dec 5, 2022Updated 3 years ago
- ☆32Jul 13, 2024Updated last year
- Yaitron English-Thai and Thai-English dictionary☆34Oct 13, 2020Updated 5 years ago
- Explainable AI for Software Engineering: A Hands-on Guide on How to Make Software Analytics More Practical, Explainable, and Actionable (…☆27Nov 14, 2021Updated 4 years ago
- Pretraining transformer based Thai language models☆123Nov 6, 2023Updated 2 years ago
- Southeast Asian layout task force☆36May 31, 2025Updated 9 months ago
- Wikidata Live Changes - Group Project - 2020☆10Apr 23, 2024Updated last year
- Code accompanying Coling2020 publication on data augmentation for named entity recognition☆34Aug 4, 2021Updated 4 years ago
- An initiative for Bangkokians to develop contributable open-source projects to solve local problems!☆38Feb 1, 2023Updated 3 years ago
- Thai word segmentation with bi-directional RNN☆83Mar 24, 2023Updated 2 years ago
- Thai_TTS is a project for training "Text to Speech in Thai" using Tacotron2 by NVIDIA.☆34May 24, 2022Updated 3 years ago
- ☆40May 4, 2024Updated last year
- A python library / model for creating co-references between AMR graph nodes.☆11Dec 11, 2022Updated 3 years ago
- The large Thai word2vec☆11Nov 16, 2022Updated 3 years ago
- A Fast and Accurate Neural Thai Word Segmenter☆94Jan 14, 2025Updated last year
- Thai Spelling Check☆41Apr 2, 2023Updated 2 years ago
- Moodle ReactJS - gives you the ability to use ReactJS inside any Moodle page.☆16Apr 28, 2022Updated 3 years ago
- texrex web page cleaning & ClaraX random walk crawler☆11Dec 13, 2021Updated 4 years ago
- ☆12Dec 7, 2022Updated 3 years ago
- Microsoft Power Platform In a Day Workshop☆13Jan 3, 2023Updated 3 years ago
- Zurich Morphological Lexicon for German: a tool to extract a morphological lexicon from Wiktionary☆12Aug 10, 2023Updated 2 years ago