sandy1990418 / ChineseTaiwaneseWhisperView external linksLinks
This repository focuses on leveraging OpenAI's Whisper model for speech recognition in Chinese (Mandarin) and Taiwanese Hokkien languages. It includes tools and scripts for data preprocessing, model training, and evaluation, tailored to improve speech recognition accuracy for these languages.
☆68Mar 1, 2025Updated 11 months ago
Alternatives and similar repositories for ChineseTaiwaneseWhisper
Users that are interested in ChineseTaiwaneseWhisper are comparing it to the libraries listed below
Sorting:
- fine-tune Whipser model for Taiwanese speech recognition☆36Mar 23, 2023Updated 2 years ago
- Vocoder-Free Non-Parallel Conversion of Whispered Speech With Masked Cycle-Consistent Generative Adversarial Networks☆17Aug 18, 2023Updated 2 years ago
- Attention-Enhanced Short-Time Wiener Solution for Acoustic Echo Cancellation☆23Nov 12, 2025Updated 3 months ago
- Taiwanese Speech Synthesis with Tacotron2☆25Oct 2, 2022Updated 3 years ago
- ☆19Sep 10, 2024Updated last year
- Speaker-aware CTC (SACTC) for multi-talker overlapped speech recognition.☆21May 26, 2025Updated 8 months ago
- ☆11Aug 11, 2023Updated 2 years ago
- eCMU: An Efficient Phase-aware Framework for Music Source Separation with Conformer (IEEE RIVF23)☆10Oct 30, 2024Updated last year
- Official repository of the work "Low-complexity Unsupervised Audio Anomaly Detection exploiting Separable Convolutions and Angular Loss" …☆10Nov 6, 2024Updated last year
- Onset-and-Offset-Aware Sound Event Detection☆20Feb 10, 2025Updated last year
- ☆11Nov 7, 2024Updated last year
- ☆10Oct 16, 2025Updated 4 months ago
- This is not remotely close to a finished product, and does not intend to nor does this claim to be working fine-tuning code for MaskGCT. …☆13Dec 4, 2024Updated last year
- [ICASSP 2025] AnCoGen: Analysis, Control and Generation of Speech with a Masked Autoencoder☆12Mar 11, 2025Updated 11 months ago
- text to speech☆10Mar 19, 2024Updated last year
- ☆13Sep 25, 2024Updated last year
- Code for the paper: MACE: Leveraging Audio for Evaluating Audio Captioning Systems☆13Jan 16, 2025Updated last year
- Tools for the automatic detection of speech-related inhalation events and characterisation of the speech respiratory cycle.☆11Feb 17, 2024Updated 2 years ago
- ☆15Nov 11, 2024Updated last year
- ☆15Nov 10, 2025Updated 3 months ago
- Cantonese Grapheme-to-Phoneme Converter based on GitYCC/g2pW☆15Dec 10, 2024Updated last year
- [ICMR 2025] Official Repository for The Paper, Let Network Decide What to Learn: Symbolic Music Understanding Model Based on Large-scale …☆18Aug 17, 2025Updated 6 months ago
- ☆14Aug 1, 2025Updated 6 months ago
- An evaluation set for large-scale trained TTS models (Coming in Sep 2024)☆12Sep 2, 2024Updated last year
- DysfluentWFST☆17Nov 13, 2025Updated 3 months ago
- ☆10Sep 19, 2022Updated 3 years ago
- Official implementation of the paper titled "Age and Gender Recognition Using a Convolutional Neural Network with a Specially Designed Mu…☆27Mar 5, 2024Updated last year
- VITS2 using Phoneme-Level Japanese BERT☆14Dec 17, 2023Updated 2 years ago
- ☆11Oct 14, 2023Updated 2 years ago
- Taiwanese Translation with BERT based model and RNN. Collection of Taiwanese text corpus☆13Oct 15, 2022Updated 3 years ago
- DiTTo-TTS: Diffusion Transformers for Scalable Text-to-Speech without Domain-Specific Factors☆36Feb 11, 2025Updated last year
- This repository contains prompts & best practices to annotate audio clips with a very high degree of details using Audio-Language-Models☆35Oct 13, 2024Updated last year
- Causal Speech Enhancement Based on a Two-Branch Nested U-Net Architecture Using Self-Supervised Speech Embeddings☆19Jun 6, 2025Updated 8 months ago
- ☆13Mar 7, 2022Updated 3 years ago
- Forced alignment decoder for Whisper.☆14Mar 13, 2024Updated last year
- ☆15Mar 31, 2025Updated 10 months ago
- ☆15Apr 2, 2025Updated 10 months ago
- Toward Multi Modality Language Model - implementation of GPT-4o/Project Astra☆16Dec 10, 2024Updated last year
- ☆16Jun 22, 2025Updated 7 months ago