ayaka14732 / TransCan
An English-to-Cantonese machine translation model
☆49 Updated last year
Alternatives and similar repositories for TransCan:
Users who are interested in TransCan are comparing it to the repositories listed below:
- BERT Tokenizer with vocabulary tailored for Cantonese ☆20 Updated 2 years ago
- 粵文語料篩選器 Cantonese text filter ☆38 Updated last month
- Transformers for Cantonese ☆56 Updated 4 years ago
- A Cantonese-English translator based on prompt engineering ☆11 Updated last year
- 粵語拼音自動標註工具 Cantonese Pronunciation Automatic Labeling Tool ☆68 Updated 6 months ago
- Cantonese-Mandarin unsupervised neural translation for sw project ☆26 Updated last year
- Upstream word list repository for rime-cantonese ☆27 Updated 7 months ago
- A curated list of resources dedicated to Natural Language Processing (NLP) of Cantonese | 粵語 NLP ☆87 Updated 3 years ago
- Hong Kong Cantonese Corpus of transcribed speech (spontaneous speech, radio programmes and a monologue) ☆56 Updated last year
- An audio and transcribed corpus of contemporary Hong Kong Cantonese ☆35 Updated 4 years ago
- A Python script for scraping LIHKG ☆30 Updated 3 years ago
- Cantonese dialogue corpus ☆24 Updated last year
- A frequency lexicon for Hong Kong Cantonese ☆21 Updated 4 years ago
- JAX implementation of the bart-base model ☆30 Updated last year
- ☆80 Updated last year
- Python scripts and datasets of the "Extremely Low-Resource Neural Machine Translation: A Case Study of Cantonese" project ☆15 Updated 2 years ago
- Cantonese segmentation tool 粵語分詞工具 ☆29 Updated 4 years ago
- A compiled list of corpora for Taiwanese, indigenous languages, and Hakka ☆40 Updated 4 years ago
- 電腦用漢字粵語拼音表 / Cantonese Pronunciation List of the Characters for Computers ☆54 Updated last year
- Data files for the Dictionary of Frequently-Used Taiwan Minnan (臺灣閩南語常用詞辭典) ☆77 Updated last year
- Cantonese Linguistics and NLP ☆374 Updated 10 months ago
- Loengfan (粵語兩分) is the Cantonese version of the Liang Fen input method ☆12 Updated 3 years ago
- Taiwanese Hokkien Transliterator and Tokeniser ☆29 Updated 6 months ago
- One script for xls-r/xlsr/whisper fine-tuning ☆41 Updated last year
- A toolset for computation and comparison of Chinese dialects ☆35 Updated this week
- ForceAlign is a Python library for forced alignment of English text to English audio. You can use ForceAlign to get word or phoneme level… ☆13 Updated 3 months ago
- Grapheme-to-Phoneme lexicons for Chinese dialects ☆67 Updated 2 years ago
- Cantonese orthography (粵語正字法) ☆13 Updated 4 years ago
- 渊 - A project for Classical Chinese ☆99 Updated 3 years ago
- Multilingual sentence alignment using sentence embeddings ☆113 Updated 4 months ago