sandy1990418 / ChineseTaiwaneseWhisper
This repository focuses on leveraging OpenAI's Whisper model for speech recognition in Chinese (Mandarin) and Taiwanese Hokkien languages. It includes tools and scripts for data preprocessing, model training, and evaluation, tailored to improve speech recognition accuracy for these languages.
☆59Updated 8 months ago
Alternatives and similar repositories for ChineseTaiwaneseWhisper
Users who are interested in ChineseTaiwaneseWhisper compare it to the libraries listed below.
- A method that directly addresses the modality gap by aligning speech token with the corresponding text transcription during the tokenizat…☆98Updated 2 months ago
- Taiwanese Speech Synthesis with Tacotron2☆22Updated 3 years ago
- Fine-tune Whisper model for Taiwanese speech recognition☆35Updated 2 years ago
- 56 languages, 1 model: multilingual ASR☆25Updated 4 years ago
- ☆95Updated last year
- Code and model for ICASSP 2025 Paper "Developing Instruction-Following Speech Language Model Without Speech Instruction-Tuning Data"☆115Updated 4 months ago
- Official release of StyleTalk dataset.☆70Updated last year
- **Interspeech 2022** 《SpeechPrompt: An Exploration of Prompt Tuning on Generative Spoken Language Model for Speech Processing Tasks》Speec…☆102Updated 7 months ago
- A mini, simple, and fast end-to-end automatic speech recognition toolkit.☆53Updated 2 years ago
- Official code for Interspeech 2023 paper "Self-supervised Fine-tuning for Improved Content Representations by Speaker-invariant Clusterin…☆61Updated 2 years ago
- Code for our INTERSPEECH paper Simul-Whisper: Attention-Guided Streaming Whisper with Truncation Detection☆96Updated 7 months ago
- PyTorch toolkit for streaming speech recognition, speech translation and simultaneous translation based on fairseq.☆25Updated 3 years ago
- Code for DeSTA2.5-Audio☆122Updated 3 months ago
- Pre-trained Wav2vec2.0 for Mandarin☆41Updated 3 years ago
- Toolbox for easy and qualitative one-shot voice conversion☆46Updated 3 years ago
- TransferTTS (Zero-Shot learning of VITS)☆101Updated 3 years ago
- Phoneme segmentation using pre-trained speech models☆55Updated 3 years ago
- 《SpeechGen: Unlocking the Generative Power of Speech Language Models with Prompts》☆77Updated 2 years ago
- ☆69Updated last year
- ☆25Updated 3 years ago
- ☆10Updated 3 years ago
- VoiceBank-2023 is the speech corpus specially designed for constructing personalized Mandarin text-to-speech (TTS) systems.☆39Updated 2 years ago
- CTC decoder with hotwords for ASR.☆34Updated 7 months ago
- This repository contains the training, inference, evaluation code for SpeechLLM models and details about the model releases on huggingfac…☆125Updated last year
- ☆29Updated 3 years ago
- How to use our public wav2vec2 age and gender model☆51Updated 2 years ago
- Fine-Tune Whisper with Transformers and PEFT☆57Updated 2 years ago
- multilingual speech aligner☆77Updated 2 years ago
- Finetuning VITS Efficiently☆33Updated 2 years ago
- X-E-Speech: Joint Training Framework of Non-Autoregressive Cross-lingual Emotional Text-to-Speech and Voice Conversion☆109Updated last year