sandy1990418 / ChineseTaiwaneseWhisper
This repository focuses on leveraging OpenAI's Whisper model for speech recognition in Chinese (Mandarin) and Taiwanese Hokkien languages. It includes tools and scripts for data preprocessing, model training, and evaluation, tailored to improve speech recognition accuracy for these languages.
☆19 Updated last week
Related projects
Alternatives and complementary repositories for ChineseTaiwaneseWhisper
- ☆13 Updated last month
- DUSTED: Spoken-Term Discovery using Discrete Speech Units ☆13 Updated last month
- Prosodic Speech Segmentation with Transformers ☆23 Updated 8 months ago
- Official code for Interspeech 2023 paper "Self-supervised Fine-tuning for Improved Content Representations by Speaker-invariant Clusterin… ☆44 Updated last year
- This repo is related to the paper "A Framework for Phoneme-Level Pronunciation Assessment Using CTC" for INTERSPEECH 2024 ☆13 Updated last week
- ☆41 Updated last year
- SpeechGLUE is a speech version of the GLUE benchmark, driven by text-to-speech. ☆13 Updated last year
- Taiwanese Speech Synthesis with Tacotron2 ☆18 Updated 2 years ago
- Fine-tune Whisper model for Taiwanese speech recognition ☆27 Updated last year
- ☆10 Updated last year
- This repo contains the official PyTorch implementation of "Analyzing Discrete Self Supervised Speech Representation For Spoken Language M… ☆17 Updated last year
- 56 languages, 1 model: multilingual ASR ☆23 Updated 3 years ago
- ☆25 Updated 2 years ago
- ☆27 Updated 7 months ago
- ASR text preprocessing utility ☆20 Updated 3 months ago
- Error correction back-end for speaker diarization ☆12 Updated last year
- AudioCodec-Hub is a Python library for encoding and decoding audio data, supporting various neural audio codec models ☆22 Updated last year
- Official implementation of MelHuBERT ☆63 Updated 2 weeks ago
- Sylber: Syllabic Embedding Representation of Speech from Raw Audio ☆17 Updated last month
- ☆31 Updated last year
- ConMamba for Automatic Speech Recognition ☆44 Updated 3 months ago
- Code for the method proposed in the paper: ccc-wav2vec 2.0: Clustering aided Cross-Contrastive learning of Self-Supervised speech repres… ☆16 Updated 7 months ago
- multilingual speech aligner ☆71 Updated 11 months ago
- ☆10 Updated last year
- An implementation of Charactr, Inc's "WavThruVec: Latent speech representation as intermediate features for neural speech synthesis" ☆26 Updated last year
- Implementation of the paper "BERTphone: Phonetically-aware Encoder Representations for Utterance-level Speaker and Language Recognition" ☆17 Updated 3 years ago
- Code for the paper: How Much Context Does My Attention-Based ASR System Need? ☆11 Updated last week
- Zero-Shot Foreign Accent Conversion without a Native Reference ☆28 Updated 6 months ago
- Official Code for SyllableLM: Learning Coarse Semantic Units for Speech Language Models ☆35 Updated last month
- Spatial Voice Conversion: Voice Conversion Preserving Spatial Information and Non-target Signals ☆14 Updated 3 months ago