sandy1990418 / ChineseTaiwaneseWhisper
This repository focuses on leveraging OpenAI's Whisper model for speech recognition in Chinese (Mandarin) and Taiwanese Hokkien languages. It includes tools and scripts for data preprocessing, model training, and evaluation, tailored to improve speech recognition accuracy for these languages.
☆63Updated 9 months ago
Alternatives and similar repositories for ChineseTaiwaneseWhisper
Users who are interested in ChineseTaiwaneseWhisper are comparing it to the repositories listed below
- A method that directly addresses the modality gap by aligning speech token with the corresponding text transcription during the tokenizat…☆100Updated 3 months ago
- Taiwanese Speech Synthesis with Tacotron2☆22Updated 3 years ago
- Code and model for ICASSP 2025 Paper "Developing Instruction-Following Speech Language Model Without Speech Instruction-Tuning Data"☆119Updated 4 months ago
- **Interspeech 2022** 《SpeechPrompt: An Exploration of Prompt Tuning on Generative Spoken Language Model for Speech Processing Tasks》Speec…☆102Updated 8 months ago
- 56 languages, 1 model: multilingual ASR☆24Updated 4 years ago
- Official release of StyleTalk dataset.☆70Updated last year
- Textless (ASR-transcript free) Spoken Question Answering. The official release of NMSQA dataset and the implementation of "DUAL: Textless…☆35Updated 2 years ago
- Fine-tune the Whisper model for Taiwanese speech recognition☆35Updated 2 years ago
- 《SpeechGen: Unlocking the Generative Power of Speech Language Models with Prompts》☆77Updated 2 years ago
- Official code for Interspeech 2023 paper "Self-supervised Fine-tuning for Improved Content Representations by Speaker-invariant Clusterin…☆62Updated 2 years ago
- Explores different ways to combine speech models (wav2vec2, HuBERT) with NLP models (BART, T5, GPT)☆46Updated 5 months ago
- A mini, simple, and fast end-to-end automatic speech recognition toolkit.☆52Updated 3 years ago
- Code for DeSTA2.5-Audio☆125Updated 4 months ago
- Official implementation of MelHuBERT☆68Updated last year
- ☆69Updated last year
- ☆96Updated last year
- Official implementation for the paper Fine-grained style control in transformer-based text-to-speech synthesis.☆89Updated 3 years ago
- Phoneme segmentation using pre-trained speech models☆55Updated 3 years ago
- ☆88Updated 4 months ago
- This repository contains the training, inference, evaluation code for SpeechLLM models and details about the model releases on huggingfac…☆124Updated last year
- Speaker change detection using SincNet and an LSTM/Transformer☆56Updated 6 months ago
- A PyTorch implementation of the universal neural vocoder☆67Updated 5 years ago
- Code for our INTERSPEECH paper Simul-Whisper: Attention-Guided Streaming Whisper with Truncation Detection☆100Updated 8 months ago
- Official implementation of "Automatic Tuning of Loss Trade-offs without Hyper-parameter Search in End-to-End Zero-Shot Speech Synthesis",…☆80Updated 2 years ago
- one script for xls-r/xlsr/whisper fine-tuning☆42Updated 2 years ago
- Fine-Tune Whisper with Transformers and PEFT☆58Updated 2 years ago
- Pre-trained Wav2vec2.0 for Mandarin☆41Updated 3 years ago
- ☆13Updated last year
- A Benchmark for Evaluating Turn-Taking and Overlap Handling in Full-Duplex Spoken Dialogue Models☆110Updated 2 months ago
- ZMM-TTS: Zero-shot Multilingual and Multispeaker Speech Synthesis Conditioned on Self-supervised Discrete Speech Representations☆180Updated last year