ayaka14732 / bert-tokenizer-cantonese
BERT Tokenizer with vocabulary tailored for Cantonese
☆20 · Updated 2 years ago
Alternatives and similar repositories for bert-tokenizer-cantonese
Users who are interested in bert-tokenizer-cantonese are comparing it to the repositories listed below:
- Cantonese-Mandarin unsupervised neural translation for sw project ☆26 · Updated last year
- An English-to-Cantonese machine translation model ☆49 · Updated 11 months ago
- An audio and transcribed corpus of contemporary Hong Kong Cantonese ☆35 · Updated 4 years ago
- Transformers for Cantonese ☆56 · Updated 4 years ago
- Unsupervised spoken sentence embeddings ☆14 · Updated 2 years ago
- Phonemes and durations labeling based on Whisper small ☆11 · Updated 8 months ago
- ROUGE score calculator with Traditional Chinese word segmentation ☆9 · Updated 3 years ago
- Taiwanese Speech Synthesis with Tacotron2 ☆19 · Updated 2 years ago
- Python scripts and datasets of the "Extremely Low-Resource Neural Machine Translation: A Case Study of Cantonese" project ☆15 · Updated 2 years ago
- Fine-tune Whisper model for Taiwanese speech recognition ☆28 · Updated last year
- Zero-Shot Foreign Accent Conversion without a Native Reference ☆30 · Updated 10 months ago
- 粵語拼音自動標註工具 Cantonese Pronunciation Automatic Labeling Tool ☆67 · Updated 5 months ago
- ROUGE for multilingual summarization ☆23 · Updated 3 years ago
- Aligner for text-to-speech ☆14 · Updated 8 months ago
- Revisiting End-to-End Speech-to-Text Translation From Scratch ☆12 · Updated 2 years ago
- Cantonese segmentation tool 粵語分詞工具 ☆29 · Updated 4 years ago
- One script for XLS-R/XLSR/Whisper fine-tuning ☆41 · Updated last year
- Project of Singing Voice Conversion. ☆14 · Updated last year
- Voicebox: Text-Guided Multilingual Universal Speech Generation at Scale ☆27 · Updated last year
- Chinese Mandarin Synthesis Corpus-Female/Emotional ☆10 · Updated 7 months ago
- ODSQA: Open-Domain Spoken Question Answering Dataset ☆59 · Updated 3 years ago
- Hong Kong Cantonese Corpus of transcribed speech (spontaneous speech, radio programmes and a monologue). ☆56 · Updated last year
- Production-ready vocoder using BigVSAN ☆11 · Updated last year
- Separately maintained Chinese TTS ☆35 · Updated 2 years ago
- Dataset (MCE) for "Developing a Multilingual Dataset and Evaluation Metrics for Code-Switching: A Focus on Hong Kong's Polylingual Dynamics" ☆22 · Updated 2 weeks ago
- Chinese polyphone disambiguation for text-to-speech applications ☆31 · Updated 9 months ago
- 4G GPU & 10 minutes for training ☆12 · Updated last year