sandy1990418 / ChineseTaiwaneseWhisper
This repository focuses on leveraging OpenAI's Whisper model for speech recognition in Chinese (Mandarin) and Taiwanese Hokkien languages. It includes tools and scripts for data preprocessing, model training, and evaluation, tailored to improve speech recognition accuracy for these languages.
☆53Updated 6 months ago
Alternatives and similar repositories for ChineseTaiwaneseWhisper
Users that are interested in ChineseTaiwaneseWhisper are comparing it to the libraries listed below
- Taiwanese Speech Synthesis with Tacotron2☆22Updated 2 years ago
- Fine-tune Whisper model for Taiwanese speech recognition☆33Updated 2 years ago
- A method that directly addresses the modality gap by aligning speech token with the corresponding text transcription during the tokenizat…☆89Updated 2 weeks ago
- 56 languages, 1 model: multilingual ASR☆25Updated 4 years ago
- ☆91Updated last year
- PyTorch toolkit for streaming speech recognition, speech translation and simultaneous translation based on fairseq.☆25Updated 2 years ago
- **Interspeech 2022** 《SpeechPrompt: An Exploration of Prompt Tuning on Generative Spoken Language Model for Speech Processing Tasks》Speec…☆102Updated 5 months ago
- Toolbox for easy and qualitative one-shot voice conversion☆46Updated 3 years ago
- ☆10Updated 2 years ago
- TransferTTS (Zero-Shot learning of VITS)☆102Updated 2 years ago
- A mini, simple, and fast end-to-end automatic speech recognition toolkit.☆53Updated 2 years ago
- ☆13Updated 11 months ago
- Official code for Interspeech 2023 paper "Self-supervised Fine-tuning for Improved Content Representations by Speaker-invariant Clusterin…☆57Updated 2 years ago
- 《SpeechGen: Unlocking the Generative Power of Speech Language Models with Prompts》☆75Updated 2 years ago
- English conversation corpus for conversational TTS.☆21Updated 2 years ago
- ☆25Updated 3 years ago
- Phoneme segmentation using pre-trained speech models☆55Updated 2 years ago
- Official release of StyleTalk dataset.☆69Updated last year
- VoiceBank-2023 is the speech corpus specially designed for constructing personalized Mandarin text-to-speech (TTS) systems.☆39Updated 2 years ago
- ASR text preprocessing utility☆21Updated last year
- multilingual speech aligner☆77Updated last year
- ☆41Updated 2 years ago
- Textless (ASR-transcript free) Spoken Question Answering. The official release of NMSQA dataset and the implementation of "DUAL: Textless…☆35Updated 2 years ago
- Code and model for ICASSP 2025 Paper "Developing Instruction-Following Speech Language Model Without Speech Instruction-Tuning Data"☆113Updated 2 months ago
- Repository for reproducing the results of the journal paper "Self-supervised learning for Speech Emotion Recognition"☆10Updated 2 years ago
- A set of audio augmentation techniques to perform noise insertion in datasets used for Automatic Speech Recognition.☆45Updated 3 years ago
- ☆68Updated last year
- [INTERSPEECH'2022] Accurate Emotion Strength Assessment for Seen and Unseen Speech Based on Data-Driven Deep Learning☆82Updated 2 years ago
- Explore different ways to mix speech models (wav2vec2, HuBERT) and NLP models (BART, T5, GPT) together☆47Updated 2 months ago
- End-to-End Mispronunciation Detection via wav2vec2.0☆48Updated 3 years ago