sandy1990418 / ChineseTaiwaneseWhisper
This repository applies OpenAI's Whisper model to speech recognition for Mandarin Chinese and Taiwanese Hokkien. It includes tools and scripts for data preprocessing, model training, and evaluation, tailored to improve recognition accuracy for these languages.
☆25Updated 2 months ago
Alternatives and similar repositories for ChineseTaiwaneseWhisper:
Users who are interested in ChineseTaiwaneseWhisper are comparing it to the libraries listed below
- fine-tune Whisper model for Taiwanese speech recognition☆28Updated last year
- ☆13Updated 4 months ago
- ☆10Updated 2 years ago
- Toward Multi Modality Language Model - implementation of GPT-4o/Project Astra☆12Updated last month
- Taiwanese Speech Synthesis with Tacotron2☆19Updated 2 years ago
- ☆41Updated last year
- Textless (ASR-transcript free) Spoken Question Answering. The official release of NMSQA dataset and the implementation of "DUAL: Textless…☆35Updated last year
- Official implementation of MelHuBERT☆64Updated 3 months ago
- LLaST: Improved End-to-end Speech Translation System Leveraged by Large Language Models☆23Updated 5 months ago
- one script for xls-r/xlsr/whisper fine-tuning☆40Updated last year
- ☆31Updated last year
- MnTTS: An Open-Source Mongolian Text-to-Speech Synthesis Dataset and Accompanied Baseline. (Accepted by IALP'2022)☆16Updated 2 years ago
- A mini, simple, and fast end-to-end automatic speech recognition toolkit.☆50Updated 2 years ago
- 《SpeechGen: Unlocking the Generative Power of Speech Language Models with Prompts》☆74Updated last year
- 56 languages, 1 model Multilingual ASR☆24Updated 3 years ago
- Code for T5lephone: Bridging Speech and Text Self-supervised Models for Spoken Language Understanding via Phoneme level T5☆19Updated 2 years ago
- Prosodic Speech Segmentation with Transformers☆25Updated 11 months ago
- **Interspeech 2022** 《SpeechPrompt: An Exploration of Prompt Tuning on Generative Spoken Language Model for Speech Processing Tasks》Speec…☆98Updated last year
- Finetuning VITS Efficiently☆32Updated last year
- Word Discovery in Visually Grounded, Self-Supervised Speech Models☆26Updated last year
- ☆12Updated last year
- Official code for Interspeech 2023 paper "Self-supervised Fine-tuning for Improved Content Representations by Speaker-invariant Clusterin…☆49Updated last year
- This repo is related to the paper "A Framework for Phoneme-Level Pronunciation Assessment Using CTC" for INTERSPEECH2024☆18Updated 2 months ago
- Wenet speech to text for react native☆10Updated 2 years ago
- Repository for Accent Recognition (Hackathon @SLT2022)☆25Updated 8 months ago
- SpeechGLUE is a speech version of the GLUE benchmark, driven by text-to-speech.☆13Updated last year
- open-source Mandarin biased word dataset☆11Updated last year
- ☆11Updated last year
- A toolset for easy formant extraction and visualization from wav files and TTS models☆30Updated 2 years ago
- ☆22Updated 5 years ago