CINO: Pre-trained Language Models for Chinese Minority (少数民族语言预训练模型)
☆262 · Jul 15, 2025 · Updated 7 months ago
Alternatives and similar repositories for cino
Users who are interested in cino are comparing it to the repositories listed below
- ☆17 · Jun 20, 2017 · Updated 8 years ago
- [ACL'24] MC^2: A Multilingual Corpus of Minority Languages in China (Tibetan, Uyghur, Kazakh, and Mongolian) ☆31 · Jan 17, 2026 · Updated last month
- TIP-LAS: An open source toolkit for Tibetan word segmentation and part-of-speech tagging ☆82 · Nov 11, 2022 · Updated 3 years ago
- A Python library to add reconstructed pronunciations of Middle Chinese to Chinese texts ☆11 · Mar 13, 2023 · Updated 2 years ago
- All Tibetan dictionaries. Commercial use is not permitted; violations will lead to legal action. ☆15 · Oct 15, 2023 · Updated 2 years ago
- 😎 Curated list of Tibetan NLP projects ☆43 · Jul 15, 2020 · Updated 5 years ago
- A new rime table specially designed for the Qieyun phonology system ☆14 · Apr 17, 2025 · Updated 10 months ago
- A simple word-level tokenizing library and tool for the Uyghur language | ئۇيغۇرچە سۆز سۈزۈش كودى ۋە قۇرالى ☆22 · Feb 4, 2014 · Updated 12 years ago
- Script files of THUYG-20 (a free Uyghur speech database released by CSLT@Tsinghua University & Xinjiang University) ☆19 · Mar 2, 2020 · Updated 6 years ago
- Speech Recognition for Uyghur using Speech transformer ☆28 · Jun 19, 2021 · Updated 4 years ago
- asr2k ☆52 · Jun 2, 2024 · Updated last year
- Chinese Dialect Database ☆18 · Jun 18, 2017 · Updated 8 years ago
- 🈵 Collected resources to learn/study Manchu (Manchurian Language). 满语滿族満州語入門。 ☆18 · Jun 7, 2023 · Updated 2 years ago
- ExpMRC: Explainability Evaluation for Machine Reading Comprehension ☆62 · Aug 30, 2023 · Updated 2 years ago
- Dataset and code for SEntFiN ☆10 · May 31, 2023 · Updated 2 years ago
- Service for Bert model to Vector. An efficient text-to-vector service with multi-GPU, multi-worker, and multi-client support; works out of the box. ☆12 · May 24, 2022 · Updated 3 years ago
- Ethnic culture dataset for large models (面向大模型的民族文化数据集) ☆12 · May 26, 2025 · Updated 9 months ago
- ☆12 · Aug 25, 2017 · Updated 8 years ago
- Code for ACL 2023 main conference paper "Back Translation for Speech-to-text Translation Without Transcripts". ☆12 · Oct 25, 2023 · Updated 2 years ago
- Supervised and unsupervised concept-based explanation of pretrained music classifiers ☆12 · Jul 27, 2023 · Updated 2 years ago
- A toolset for computation and comparison of Chinese dialects ☆45 · Feb 15, 2026 · Updated 3 weeks ago
- Chinese dialect characters (汉语方言字) https://fangyanzi.vercel.app ☆24 · Nov 1, 2022 · Updated 3 years ago
- A Benchmark Corpus for Low-Resource Cantonese Punctuation Restoration from Speech Transcripts ☆16 · Dec 3, 2024 · Updated last year
- Download OkCupid users' public data automatically ☆10 · Feb 6, 2022 · Updated 4 years ago
- A TensorFlow implementation of "A Stack-Propagation Framework with Token-Level Intent Detection for Spoken Language Understanding". ☆10 · Jan 22, 2021 · Updated 5 years ago
- Uyghur Word List ☆44 · Mar 7, 2016 · Updated 10 years ago
- Cantonese dialogue corpus (粵語對話語料) ☆29 · May 12, 2023 · Updated 2 years ago
- ☆27 · Aug 31, 2022 · Updated 3 years ago
- PERT: Pre-training BERT with Permuted Language Model ☆367 · Jul 15, 2025 · Updated 7 months ago
- A JavaScript library for the Qieyun system ☆51 · Jan 26, 2026 · Updated last month
- This is the experimental description of MnTTS2. ☆11 · Apr 11, 2024 · Updated last year
- For audio visualization and playback in Jupyter notebooks. ☆17 · Nov 25, 2025 · Updated 3 months ago
- ☆12 · Jan 22, 2017 · Updated 9 years ago
- Unsupervised ASR (mainly phone classifier) using EODM and GAN ☆12 · Oct 22, 2020 · Updated 5 years ago
- ☆25 · Feb 12, 2023 · Updated 3 years ago
- Sichuan dialect speech dataset (四川方言语音数据集) ☆19 · May 9, 2023 · Updated 2 years ago
- ☆18 · Dec 29, 2024 · Updated last year
- A piano music dataset with Audio, Symbolic and Text labels ☆34 · Mar 6, 2025 · Updated last year
- Chinese-ASR built on Kaldi ☆14 · Jan 21, 2019 · Updated 7 years ago