evelynkyl / yue_nmt
Python scripts and datasets of the "Extremely Low-Resource Neural Machine Translation: A Case Study of Cantonese" project
☆14 Updated 2 years ago
Related projects
Alternatives and complementary repositories for yue_nmt
- Cantonese-Mandarin unsupervised neural translation for sw project ☆24 Updated last year
- An audio and transcribed corpus of contemporary Hong Kong Cantonese ☆34 Updated 3 years ago
- BERT Tokenizer with vocabulary tailored for Cantonese ☆20 Updated 2 years ago
- Fine-tuning Wav2Vec2.0 on Common Voice (zh-HK) ☆14 Updated 2 years ago
- ☆33 Updated 3 years ago
- Explore different ways to mix speech models (wav2vec2, hubert) and NLP models (BART, T5, GPT) together ☆42 Updated last year
- A frequency lexicon for Hong Kong Cantonese ☆20 Updated 4 years ago
- The repository for the paper: Multilingual Translation via Grafting Pre-trained Language Models ☆24 Updated 3 years ago
- Multilingual speech translation ☆41 Updated 3 years ago
- ☆54 Updated last year
- ☆56 Updated 2 years ago
- Multilingual and monolingual corpora focused on Caucasus languages for Natural Language Processing (NLP) ☆33 Updated 3 weeks ago
- Spoken Cantonese from Hong Kong. ☆29 Updated last week
- Cantonese segmentation tool 粵語分詞工具 ☆29 Updated 4 years ago
- PETCI: A Parallel English Translation Dataset of Chinese Idioms ☆20 Updated 2 years ago
- Revisiting End-to-End Speech-to-Text Translation From Scratch ☆12 Updated last year
- Code for "Planning and Generating Natural and Diverse Disfluent Texts as Augmentation for Disfluency Detection" ☆15 Updated 2 years ago
- ☆24 Updated 4 years ago
- ASR text preprocessing utility ☆20 Updated 3 months ago
- SHAS: Approaching optimal Segmentation for End-to-End Speech Translation ☆37 Updated last year
- ODSQA: Open-Domain Spoken Question Answering Dataset ☆58 Updated 2 years ago
- Code for AAAI 2021 paper "Lexically Constrained Neural Machine Translation with Explicit Alignment Guidance" ☆25 Updated last year
- Convert English text from written expressions into spoken forms ☆21 Updated 2 years ago
- ROUGE for multilingual Summarization ☆23 Updated 3 years ago
- Learning to Rewrite for Non-Autoregressive Neural Machine Translation ☆22 Updated 2 years ago
- The case study and multilingual performance of ICASSP submission ☆19 Updated 2 years ago
- Multilingual sentence alignment using sentence embeddings ☆101 Updated 3 weeks ago
- Whisper_MCE ☆16 Updated 5 months ago
- For a paper about leveraging discourse markers for training new models ☆9 Updated 2 years ago
- Unsupervised spoken sentence embeddings ☆14 Updated last year