shenfei1010/CyberCan

CyberCan is a lexicon of contemporary Cantonese based on more than 100 million pieces of internet text from discussion forums in Hong Kong.

☆12

Alternatives and similar repositories for CyberCan

Users that are interested in CyberCan are comparing it to the libraries listed below.

CanCLID / rime-loengfan
Loengfan (粵語兩分) is the Cantonese version of the Liang Fen input method
☆15 · Mar 3, 2022 · Updated 4 years ago
nk2028 / commonly-used-chinese-characters-and-words
Lists of commonly used Chinese characters and words (漢語常用字詞表)
☆16 · Jun 3, 2023 · Updated 3 years ago
ayaka14732 / cantoseg
Cantonese segmentation tool
☆31 · Aug 22, 2020 · Updated 5 years ago
chainsawriot / rectr
💒 Reproducible Extraction of Cross-lingual Topics using R
☆20 · Jul 12, 2023 · Updated 3 years ago
UserXiaohu / lda-model
Extracts topics from Chinese text and classifies new text according to those topics
☆12 · May 18, 2020 · Updated 6 years ago
networkdynamics / humanizr
☆32 · Jul 6, 2015 · Updated 11 years ago
gwinterstein / Cifu
A frequency lexicon for Hong Kong Cantonese
☆25 · Aug 27, 2020 · Updated 5 years ago
ayaka14732 / gpt4-cantonese-english-translator
A Cantonese-English translator based on prompt engineering
☆12 · Sep 19, 2023 · Updated 2 years ago
TypeDuck-HK / TypeDuck-Web
TypeDuck: Cantonese for everyone at your fingertips
☆23 · Jul 1, 2026 · Updated 3 weeks ago
ayaka14732 / bert-tokenizer-cantonese
BERT Tokenizer with vocabulary tailored for Cantonese
☆23 · Oct 27, 2022 · Updated 3 years ago
justinchuntingho / songotsti
A Package for Cantonese Tokenisation
☆18 · Jun 17, 2021 · Updated 5 years ago
ubc / canvas-discussion
Outputs Canvas discussions as a CSV for a specified course.
☆14 · Feb 27, 2026 · Updated 4 months ago
schochastics / PSAWR
R package to interact with the Pushshift.io API
☆10 · Aug 4, 2025 · Updated 11 months ago
nk2028 / rime-tupa
Rime input schema for TUPA (切韻拼音, Qieyun romanization)
☆51 · Feb 12, 2026 · Updated 5 months ago
aliyun / stdvga-win-for-qemu
☆18 · Jun 30, 2026 · Updated 3 weeks ago
NGLI / rime-wugniu_soutseu
A Rime pinyin input schema for the Suzhou dialect of Wu Chinese
☆21 · May 25, 2026 · Updated 2 months ago
nk2028 / yitizi
Input a Chinese character and get all of its variant forms
☆23 · Apr 13, 2025 · Updated last year
chaaklau / cantorocks
A simple HTML5 Jyutping learning game
☆23 · Nov 25, 2025 · Updated 8 months ago
Gautamshahi / Misinformation_COVID-19
Dataset for analysing Propagation of COVID-19 Misinformation on Twitter
☆18 · Jan 31, 2024 · Updated 2 years ago
lshk-org / jyutping-table
Cantonese Pronunciation List of the Characters for Computers
☆66 · Jan 11, 2024 · Updated 2 years ago
Papnas / shupin
☆23 · Apr 21, 2022 · Updated 4 years ago
indiejoseph / hkcc-corpus
Packager for the Hong Kong mid-20th-century Cantonese corpus (《香港二十世紀中期粵語語料庫》)
☆16 · Apr 12, 2016 · Updated 10 years ago
CanCLID / canto-filter
Cantonese text filter
☆43 · Feb 4, 2026 · Updated 5 months ago
justinchuntingho / LIHKGr
R Scraper for LIHKG, the Hong Kong version of Reddit.
☆18 · Nov 24, 2020 · Updated 5 years ago
toastynews / hong-kong-fastText
fastText vectors created from Hong Kong data.
☆22 · Jul 7, 2020 · Updated 6 years ago
CanCLID / rime-cantonese-schemes
Patches for the Rime Cantonese input method, for users of alternative Cantonese romanisation schemes
☆27 · Sep 29, 2025 · Updated 9 months ago
justingrimmer / ModelInference
Slides and homework for model based inference
☆13 · Sep 26, 2017 · Updated 8 years ago
lennylxx / google-input-tools-macos
Google Input Tools for macOS
☆34 · Mar 29, 2026 · Updated 3 months ago
SocratesClub / datascience
Introduction to Python Programming for Data Science
☆40 · Oct 3, 2023 · Updated 2 years ago
sjgiorgi / blm_twitter_corpus
Corpus of Black Lives Matter and counter-protest tweets
☆14 · Dec 22, 2022 · Updated 3 years ago
ar0ne / latex-equation-editor
Simple online editor of math formulas based on LaTeX syntax. Contains table of popular equations and chars for easy work with it to help …
☆10 · Sep 13, 2019 · Updated 6 years ago
ayaka14732 / basehangul-online
Online BaseHangul Encoder And Decoder
☆13 · Jan 30, 2023 · Updated 3 years ago
ztjhz / graphviz-editor
Generates a Graphviz image URL that can be used directly on any website, with no need to host the image on a server
☆14 · Feb 21, 2026 · Updated 5 months ago
wordshk / yue_references
Reference Materials for Yue / Cantonese
☆15 · Dec 12, 2025 · Updated 7 months ago
philinew / css_images_audio
Page for the class "Computational Social Science with Images and Audio" at ETH Zurich.
☆13 · Sep 18, 2025 · Updated 10 months ago
dlfrnaos19 / tpu-starter-korean
☆10 · Oct 21, 2022 · Updated 3 years ago
simonlindgren / 2wttr
Get tweets from the v2 Twitter API, using Academic access.
☆19 · Apr 14, 2021 · Updated 5 years ago
biopolyhedron / rime-middle-chinese
Middle Chinese (Qieyun phonological system): full-spelling and three-key spelling input schemas
☆35 · Mar 26, 2021 · Updated 5 years ago
cedoard / snscrape_twitter
Using the snscrape and tweepy libraries to scrape an unlimited number of tweets
☆27 · Mar 1, 2021 · Updated 5 years ago