This repository focuses on leveraging OpenAI's Whisper model for speech recognition in Chinese (Mandarin) and Taiwanese Hokkien languages. It includes tools and scripts for data preprocessing, model training, and evaluation, tailored to improve speech recognition accuracy for these languages.
☆70Mar 1, 2025Updated last year
Alternatives and similar repositories for ChineseTaiwaneseWhisper
Users that are interested in ChineseTaiwaneseWhisper are comparing it to the libraries listed below
Sorting:
- fine-tune Whipser model for Taiwanese speech recognition☆37Mar 23, 2023Updated 2 years ago
- Vocoder-Free Non-Parallel Conversion of Whispered Speech With Masked Cycle-Consistent Generative Adversarial Networks☆17Aug 18, 2023Updated 2 years ago
- Attention-Enhanced Short-Time Wiener Solution for Acoustic Echo Cancellation☆25Nov 12, 2025Updated 3 months ago
- Taiwanese Speech Synthesis with Tacotron2☆25Oct 2, 2022Updated 3 years ago
- ☆19Sep 10, 2024Updated last year
- Speaker-aware CTC (SACTC) for multi-talker overlapped speech recognition.☆21May 26, 2025Updated 9 months ago
- Official repository of the work "Low-complexity Unsupervised Audio Anomaly Detection exploiting Separable Convolutions and Angular Loss" …☆10Nov 6, 2024Updated last year
- ☆11Aug 11, 2023Updated 2 years ago
- eCMU: An Efficient Phase-aware Framework for Music Source Separation with Conformer (IEEE RIVF23)☆10Oct 30, 2024Updated last year
- Onset-and-Offset-Aware Sound Event Detection☆21Feb 10, 2025Updated last year
- This is not remotely close to a finished product, and does not intend to nor does this claim to be working fine-tuning code for MaskGCT. …☆13Dec 4, 2024Updated last year
- [ICASSP 2025] AnCoGen: Analysis, Control and Generation of Speech with a Masked Autoencoder☆12Mar 11, 2025Updated 11 months ago
- ☆13Sep 25, 2024Updated last year
- ☆11Nov 7, 2024Updated last year
- text to speech☆10Mar 19, 2024Updated last year
- ☆10Oct 16, 2025Updated 4 months ago
- Code for the paper: MACE: Leveraging Audio for Evaluating Audio Captioning Systems☆13Jan 16, 2025Updated last year
- ☆15Nov 10, 2025Updated 3 months ago
- Tools for the automatic detection of speech-related inhalation events and characterisation of the speech respiratory cycle.☆11Feb 17, 2024Updated 2 years ago
- Cantonese Grapheme-to-Phoneme Converter based on GitYCC/g2pW☆15Dec 10, 2024Updated last year
- ☆14Aug 1, 2025Updated 7 months ago
- [ICMR 2025] Official Repository for The Paper, Let Network Decide What to Learn: Symbolic Music Understanding Model Based on Large-scale …☆18Aug 17, 2025Updated 6 months ago
- ☆10Sep 19, 2022Updated 3 years ago
- ☆15Nov 11, 2024Updated last year
- Official implementation of the paper titled "Age and Gender Recognition Using a Convolutional Neural Network with a Specially Designed Mu…☆27Mar 5, 2024Updated 2 years ago
- DysfluentWFST☆18Nov 13, 2025Updated 3 months ago
- DiTTo-TTS: Diffusion Transformers for Scalable Text-to-Speech without Domain-Specific Factors☆37Feb 11, 2025Updated last year
- ☆11Oct 14, 2023Updated 2 years ago
- VITS2 using Phoneme-Level Japanese BERT☆14Dec 17, 2023Updated 2 years ago
- Taiwanese Translation with BERT based model and RNN. Collection of Taiwanese text corpus☆13Oct 15, 2022Updated 3 years ago
- This repository contains prompts & best practices to annotate audio clips with a very high degree of details using Audio-Language-Models☆35Oct 13, 2024Updated last year
- ☆13Mar 7, 2022Updated 4 years ago
- ☆15Apr 2, 2025Updated 11 months ago
- Causal Speech Enhancement Based on a Two-Branch Nested U-Net Architecture Using Self-Supervised Speech Embeddings☆19Jun 6, 2025Updated 9 months ago
- Forced alignment decoder for Whisper.☆14Mar 13, 2024Updated last year
- Toward Multi Modality Language Model - implementation of GPT-4o/Project Astra☆16Dec 10, 2024Updated last year
- ☆15Mar 31, 2025Updated 11 months ago
- ☆16Jun 22, 2025Updated 8 months ago
- ☆31Jul 13, 2023Updated 2 years ago