gwinterstein/CantoMap

An audio and transcribed corpus of contemporary Hong Kong Cantonese

☆41

Alternatives and similar repositories for CantoMap

Users who are interested in CantoMap are comparing it to the repositories listed below.

HLTCHKUST / cantonese-asr
☆103 · Feb 1, 2024 · Updated 2 years ago
fcbond / hkcancor
Hong Kong Cantonese Corpus of transcribed speech (spontaneous speech, radio programmes and a monologue).
☆95 · Nov 3, 2025 · Updated 8 months ago
gwinterstein / Cifu
A frequency lexicon for Hong Kong Cantonese
☆25 · Aug 27, 2020 · Updated 5 years ago
CanCLID / awesome-cantonese-nlp
A curated list of resources dedicated to Natural Language Processing (NLP) of Cantonese | 粵語 NLP
☆95 · Oct 17, 2021 · Updated 4 years ago
meganndare / cantonese-nlp
cantonese-mandarin unsupervised neural translation for sw project
☆29 · May 2, 2023 · Updated 3 years ago
johnwdubois / rezonator
Rezonator: Dynamics of human engagement
☆34 · Jul 8, 2026 · Updated 3 weeks ago
paramiai / cantoformer
Transformers for Cantonese
☆58 · Oct 24, 2020 · Updated 5 years ago
chenchenzi / HKCantonese_models
This is a repository dedicated for pre-trained acoustic models of Hong Kong Cantonese and Cantonese forced alignment.
☆29 · Nov 14, 2024 · Updated last year
ayaka14732 / TransCan
An English-to-Cantonese machine translation model
☆55 · Mar 26, 2025 · Updated last year
wonjune-kang / expressive-speech-retrieval
Expressive Speech Retrieval using Natural Language Descriptions of Speaking Style
☆15 · Aug 18, 2025 · Updated 11 months ago
orphanBB / hnist_oj
This repository stores solutions to problems on the Hunan Institute of Science and Technology OJ
☆11 · Oct 7, 2021 · Updated 4 years ago
lwang114 / GraphUnsupASR
☆10 · Apr 17, 2024 · Updated 2 years ago
chutaklee / CantoASR
Fine-tuning Wav2Vec2.0 on Common Voice (zh-HK)
☆16 · May 8, 2022 · Updated 4 years ago
ymgw55 / WSMD
Improving word mover’s distance by leveraging self-attention matrix (Published in EMNLP 2023 Findings)
☆10 · Mar 10, 2026 · Updated 4 months ago
ayaka14732 / cantoseg
Cantonese segmentation tool 粵語分詞工具
☆31 · Aug 22, 2020 · Updated 5 years ago
0nutation / SLMTokBench
SLMTokBench for paper "SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models"
☆37 · Aug 29, 2023 · Updated 2 years ago
toastynews / hong-kong-fastText
fastText vectors created from Hong Kong data.
☆22 · Jul 7, 2020 · Updated 6 years ago
ayaka14732 / bert-tokenizer-cantonese
BERT Tokenizer with vocabulary tailored for Cantonese
☆23 · Oct 27, 2022 · Updated 3 years ago
notHulK11 / CantoCaptions
☆48 · Updated this week
pengcuix / AdaptiveQG
☆10 · Jun 1, 2024 · Updated 2 years ago
voidful / MMLM
Toward Multi Modality Language Model - implementation of GPT-4o/Project Astra
☆16 · Dec 10, 2024 · Updated last year
hon9kon9ize / hkeval2025
☆22 · Aug 12, 2025 · Updated 11 months ago
voidful / asrp
ASR text preprocessing utility
☆21 · Aug 5, 2024 · Updated last year
ayaka14732 / gpt4-cantonese-english-translator
A Cantonese-English translator based on prompt engineering
☆12 · Sep 19, 2023 · Updated 2 years ago
dustinfife / fifer
A collection of R functions for data manipulation, data analysis, and plotting
☆14 · Oct 29, 2020 · Updated 5 years ago
LiChaiUSTC / CSL-L2M
☆18 · May 4, 2025 · Updated last year
MiscellaneousStuff / PhoneLM
(R&D) Text to speech using phonemes as inputs and audio codec codes as outputs. Loosely based on MegaByte, VALL-E and Encodec.
☆48 · Sep 4, 2023 · Updated 2 years ago
backspacetg / distilXLSR
Models and codes for INTERSPEECH 2023 paper DistilXLSR: A Light Weight Cross-Lingual Speech Representation Model
☆13 · Mar 30, 2025 · Updated last year
lucadellalib / ts-asr
Target speaker automatic speech recognition (TS-ASR)
☆14 · Oct 14, 2023 · Updated 2 years ago
hfhchan / ids
Ideographic Description Sequences
☆33 · Nov 27, 2025 · Updated 8 months ago
VKW2021 / kaldi-baseline
Kaldi CNN-TDNNF baseline
☆13 · Aug 31, 2021 · Updated 4 years ago
CanCLID / canto-filter
粵文語料篩選器 Cantonese text filter
☆43 · Feb 4, 2026 · Updated 5 months ago
currentslab / fastlangid
fastlangid, the only language identification package that support cantonese (zh-yue), simplified (zh-hans) and traditional chinese (zh-ha…
☆43 · Dec 6, 2022 · Updated 3 years ago
BUTSpeechFIT / hystoc
Getting confidences from any end-to-end systems
☆11 · May 24, 2023 · Updated 3 years ago
sweetcocoa / crepe-pytorch
Implementation of CREPE Pitch tracker with PyTorch
☆19 · Jan 28, 2020 · Updated 6 years ago
svakulenk0 / conversation_mining
☆16 · May 14, 2020 · Updated 6 years ago
shengcanxu / canoSpeech
Text to speech
☆10 · Mar 19, 2024 · Updated 2 years ago
VITA-Group / Audio-Lottery
[ICLR 2022] "Audio Lottery: Speech Recognition Made Ultra-Lightweight, Noise-Robust, and Transferable", by Shaojin Ding, Tianlong Chen, Z…
☆32 · Apr 8, 2022 · Updated 4 years ago
speechio / asr-noises
A handy dataset of noises for ASR
☆22 · May 29, 2019 · Updated 7 years ago