UniversalDependencies/UD_Cantonese-HK

UniversalDependencies / UD_Cantonese-HK

Spoken Cantonese from Hong Kong.

☆30

Alternatives and similar repositories for UD_Cantonese-HK

Users who are interested in UD_Cantonese-HK are comparing it to the repositories listed below.

UniversalDependencies / UD_Chinese-HK
Spoken Mandarin Chinese from Hong Kong.
☆13 · May 6, 2026 · Updated 2 months ago
gwinterstein / Cifu
A frequency lexicon for Hong Kong Cantonese
☆25 · Aug 27, 2020 · Updated 5 years ago
CanCLID / awesome-cantonese-nlp
A curated list of resources dedicated to Natural Language Processing (NLP) of Cantonese | 粵語 NLP
☆95 · Oct 17, 2021 · Updated 4 years ago
paramiai / cantoformer
Transformers for Cantonese
☆58 · Oct 24, 2020 · Updated 5 years ago
ayaka14732 / cantoseg
Cantonese segmentation tool 粵語分詞工具
☆31 · Aug 22, 2020 · Updated 5 years ago
toastynews / electra-hongkongese
Pre-trained ELECTRA from Hong Kong data
☆29 · Jul 7, 2020 · Updated 6 years ago
esantus / EVALution
Dataset containing Semantic Relations and Metadata, for Training and Evaluating Distributional Semantic Models in English and Mandarin Ch…
☆16 · Aug 7, 2017 · Updated 8 years ago
toastynews / hong-kong-fastText
fastText vectors created from Hong Kong data.
☆22 · Jul 7, 2020 · Updated 6 years ago
smosanu / trie-python-graphviz
This is an object-oriented implementation of a Trie in Python. The class contains setter and getter methods, and implements several usefu…
☆15 · Jan 9, 2018 · Updated 8 years ago
linguistica-uchicago / lxa5
Linguistica 5: Unsupervised Learning of Linguistic Structure
☆32 · Jun 9, 2026 · Updated last month
jacksonllee / pycantonese
Cantonese Linguistics and NLP
☆413 · May 26, 2026 · Updated 2 months ago
coltekin / turkish-nlp-resources
Corpora, tools and resources for Turkish NLP
☆14 · May 27, 2020 · Updated 6 years ago
justinchuntingho / songotsti
A Package for Cantonese Tokenisation
☆18 · Jun 17, 2021 · Updated 5 years ago
CodeYourFuture / Table-of-Contents
Where everything is on this gigantic messy org account
☆17 · Dec 29, 2024 · Updated last year
jyutnet / cantonese-books-data
Cantonese pronunciation data collection: data from classical texts
☆245 · Updated this week
UniversalDependencies / UD_Chinese-CFL
Chinese as a foreign language.
☆14 · May 6, 2026 · Updated 2 months ago
CanCLID / canto-filter
粵文語料篩選器 Cantonese text filter
☆43 · Feb 4, 2026 · Updated 5 months ago
spyysalo / conllu.py
CoNLL-U format library for Python
☆15 · Apr 7, 2015 · Updated 11 years ago
lmorgadodacosta / CantoneseWN
The Cantonese Wordnet
☆15 · Dec 4, 2023 · Updated 2 years ago
HLTCHKUST / cantonese-asr
☆103 · Feb 1, 2024 · Updated 2 years ago
apertium / apertium-kir
Apertium linguistic data for Kyrgyz
☆17 · Jul 6, 2026 · Updated 3 weeks ago
ayaka14732 / lihkg-scraper
A Python script for scraping LIHKG
☆32 · Mar 7, 2022 · Updated 4 years ago
ray1007 / GWE
☆31 · Jun 2, 2018 · Updated 8 years ago
ichitenfont / suppchara
List of commonly used Hong Kong characters outside the standard character sets
☆56 · Sep 7, 2022 · Updated 3 years ago
hfst / compmorph-course
Jupyter notebooks for the course "Computational Morphology with HFST".
☆21 · Oct 5, 2022 · Updated 3 years ago
lwang114 / GraphUnsupASR
☆10 · Apr 17, 2024 · Updated 2 years ago
HKUST-KnowComp / JWE
Joint Embeddings of Chinese Words, Characters, and Fine-grained Subcharacter Components
☆100 · Jun 21, 2019 · Updated 7 years ago
UniversalDependencies / UD_English-EWT
English data
☆231 · Updated this week
alvations / NTU-MC
Nanyang Technological University - Multilingual Corpus (STB subcorpora)
☆12 · Mar 11, 2019 · Updated 7 years ago
fbkarsdorp / alignment
Simple Python library for doing (multiple) sequence alignment
☆17 · Jun 24, 2018 · Updated 8 years ago
siuying / cantonese-syllables
Scrape Cantonese syllables from the CUHK Multi-function Chinese Character Database.
☆11 · Mar 18, 2015 · Updated 11 years ago
kan-bayashi / INTERSPEECH19_TUTORIAL
Interspeech 2019 tutorial materials
☆49 · Sep 26, 2019 · Updated 6 years ago
JFChi / PLUE
☆11 · May 25, 2023 · Updated 3 years ago
chatopera / text-cfg-parser
CFG syntactic parsing for natural language processing
☆10 · Mar 27, 2018 · Updated 8 years ago
voidful / Phraseg
Phraseg (一言): a new-word discovery toolkit
☆26 · Nov 30, 2021 · Updated 4 years ago
shenfei1010 / CyberCan
CyberCan is a lexicon of contemporary Cantonese based on more than 100 million pieces of internet texts from discussion forums in Hong Ko…
☆12 · Aug 24, 2021 · Updated 4 years ago
evelynkyl / yue_nmt
Python scripts and datasets of the "Extremely Low-Resource Neural Machine Translation: A Case Study of Cantonese" project
☆16 · Oct 28, 2022 · Updated 3 years ago
indiejoseph / hkcc-corpus
Packager for the Corpus of Mid-20th Century Hong Kong Cantonese
☆16 · Apr 12, 2016 · Updated 10 years ago
pranav-ust / 2kenize
Upcoming ACL 2020 paper
☆26 · May 8, 2020 · Updated 6 years ago