sandy1990418 / ChineseTaiwaneseWhisper
This repository focuses on leveraging OpenAI's Whisper model for speech recognition in Chinese (Mandarin) and Taiwanese Hokkien languages. It includes tools and scripts for data preprocessing, model training, and evaluation, tailored to improve speech recognition accuracy for these languages.
☆37Updated 2 months ago
Alternatives and similar repositories for ChineseTaiwaneseWhisper
Users who are interested in ChineseTaiwaneseWhisper are comparing it to the libraries listed below.
- Fine-tune Whisper model for Taiwanese speech recognition☆29Updated 2 years ago
- ☆62Updated this week
- Toward Multi Modality Language Model - implementation of GPT-4o/Project Astra☆14Updated 5 months ago
- ☆13Updated 7 months ago
- Official implementation of MelHuBERT☆65Updated 6 months ago
- Taiwanese Speech Synthesis with Tacotron2☆19Updated 2 years ago
- Code and model for ICASSP 2025 Paper "Developing Instruction-Following Speech Language Model Without Speech Instruction-Tuning Data"☆86Updated 2 months ago
- Zero-Shot Foreign Accent Conversion without a Native Reference☆33Updated last year
- ☆10Updated 2 years ago
- 56 languages, 1 model: multilingual ASR☆25Updated 3 years ago
- ☆41Updated 2 years ago
- Textless (ASR-transcript free) Spoken Question Answering. The official release of NMSQA dataset and the implementation of "DUAL: Textless…☆35Updated last year
- 《SpeechGen: Unlocking the Generative Power of Speech Language Models with Prompts》☆73Updated last year
- Data manipulation and transformation for audio signal processing, powered by PyTorch☆8Updated 7 months ago
- Official code for Interspeech 2023 paper "Self-supervised Fine-tuning for Improved Content Representations by Speaker-invariant Clusterin…☆52Updated 2 years ago
- AudioCodec-Hub is a Python library for encoding and decoding audio data, supporting various neural audio codec models☆22Updated last year
- ASR text preprocessing utility☆21Updated 9 months ago
- ☆16Updated last year
- ☆21Updated 8 months ago
- ☆29Updated last month
- ☆10Updated last year
- CTC decoder with hotwords for ASR.☆20Updated last month
- Official implementation of "Automatic Tuning of Loss Trade-offs without Hyper-parameter Search in End-to-End Zero-Shot Speech Synthesis",…☆79Updated last year
- **Interspeech 2022** 《SpeechPrompt: An Exploration of Prompt Tuning on Generative Spoken Language Model for Speech Processing Tasks》Speec…☆99Updated last month
- Generative Fusion Decoding (GFD) is a novel framework for integrating Large Language Models (LLMs) into multi-modal text recognition syst…☆79Updated 8 months ago
- This repo contains the official PyTorch implementation of "Analyzing Discrete Self Supervised Speech Representation For Spoken Language M…☆18Updated 2 years ago
- MnTTS: An Open-Source Mongolian Text-to-Speech Synthesis Dataset and Accompanied Baseline. (Accepted by IALP'2022)☆19Updated 2 years ago
- A Comprehensive Mandarin Speech Dataset for Young Children Aged 3-5☆29Updated last month
- Pre-trained Wav2vec2.0 for Mandarin☆40Updated 2 years ago
- ☆10Updated last year