sandy1990418 / ChineseTaiwaneseWhisper
This repository focuses on leveraging OpenAI's Whisper model for speech recognition in Chinese (Mandarin) and Taiwanese Hokkien languages. It includes tools and scripts for data preprocessing, model training, and evaluation, tailored to improve speech recognition accuracy for these languages.
☆40Updated 4 months ago
Alternatives and similar repositories for ChineseTaiwaneseWhisper
Users who are interested in ChineseTaiwaneseWhisper are comparing it to the libraries listed below.
- Fine-tune Whisper model for Taiwanese speech recognition☆31Updated 2 years ago
- Taiwanese Speech Synthesis with Tacotron2☆20Updated 2 years ago
- A method that directly addresses the modality gap by aligning speech token with the corresponding text transcription during the tokenizat…☆81Updated this week
- ☆16Updated last week
- Toward Multi Modality Language Model - implementation of GPT-4o/Project Astra☆15Updated 6 months ago
- Official implementation of MelHuBERT☆65Updated 8 months ago
- Code and model for ICASSP 2025 Paper "Developing Instruction-Following Speech Language Model Without Speech Instruction-Tuning Data"☆94Updated 4 months ago
- ASR text preprocessing utility☆21Updated 10 months ago
- one script for xls-r/xlsr/whisper fine-tuning☆42Updated 2 years ago
- ☆13Updated 9 months ago
- Pre-trained Wav2vec2.0 for Mandarin☆40Updated 2 years ago
- PyTorch toolkit for streaming speech recognition, speech translation and simultaneous translation based on fairseq.☆25Updated 2 years ago
- Textless (ASR-transcript free) Spoken Question Answering. The official release of NMSQA dataset and the implementation of "DUAL: Textless…☆35Updated last year
- A benchmark corpus for ASR hypothesis revising task☆21Updated last year
- 56 languages, 1 model: multilingual ASR☆25Updated 3 years ago
- 《SpeechGen: Unlocking the Generative Power of Speech Language Models with Prompts》☆73Updated 2 years ago
- A fast parallel PyTorch implementation of the "CIF: Continuous Integrate-and-Fire for End-to-End Speech Recognition" https://arxiv.org/ab…☆33Updated last year
- ☆10Updated 2 years ago
- Code for T5lephone: Bridging Speech and Text Self-supervised Models for Spoken Language Understanding via Phoneme level T5☆19Updated 2 years ago
- A Study of Low-Resource Speech Commands Recognition Based on Adversarial Reprogramming☆19Updated last year
- The official repository of Dynamic-SUPERB.☆183Updated last week
- Generative Fusion Decoding (GFD) is a novel framework for integrating Large Language Models (LLMs) into multi-modal text recognition syst…☆82Updated last month
- This repo contains the official PyTorch implementation of "Analyzing Discrete Self Supervised Speech Representation For Spoken Language M…☆19Updated 2 years ago
- ☆86Updated last year
- Explore different ways to mix speech models (wav2vec2, HuBERT) and NLP models (BART, T5, GPT) together☆47Updated 2 years ago
- ☆22Updated 5 years ago
- ☆10Updated 4 months ago
- Word Discovery in Visually Grounded, Self-Supervised Speech Models☆26Updated last year
- **Interspeech 2022** 《SpeechPrompt: An Exploration of Prompt Tuning on Generative Spoken Language Model for Speech Processing Tasks》Speec…☆100Updated 2 months ago
- An evolving, large-scale and multi-domain ASR corpus for low-resource languages with automated crawling, transcription and refinement☆160Updated this week