This repository applies OpenAI's Whisper model to speech recognition in Mandarin Chinese and Taiwanese Hokkien. It includes tools and scripts for data preprocessing, model training, and evaluation, tailored to improve recognition accuracy for these languages.
☆72Mar 1, 2025Updated last year
Alternatives and similar repositories for ChineseTaiwaneseWhisper
Users that are interested in ChineseTaiwaneseWhisper are comparing it to the libraries listed below.
- Fine-tune Whisper model for Taiwanese speech recognition☆37Mar 23, 2023Updated 3 years ago
- Breeze ASR 25 is an advanced automatic speech recognition (ASR) model fine-tuned from Whisper-large-v2 and optimized specifically for Taiwanese Mandarin and for mixed Mandarin-English speech.☆120Jul 1, 2025Updated 11 months ago
- Semi-supervised Learning for Multi-speaker Text-to-speech Synthesis Using Discrete Speech Representation☆39Jul 16, 2020Updated 5 years ago
- Vocoder-Free Non-Parallel Conversion of Whispered Speech With Masked Cycle-Consistent Generative Adversarial Networks☆17Aug 18, 2023Updated 2 years ago
- The accompanying code for "Exploring the limits of decoder-only models trained on public speech recognition corpora" (Ankit Gupta, George…☆21Oct 11, 2024Updated last year
- Taiwanese Translation with BERT based model and RNN. Collection of Taiwanese text corpus☆13Oct 15, 2022Updated 3 years ago
- ☆15Sep 9, 2021Updated 4 years ago
- ☆10Sep 19, 2022Updated 3 years ago
- Taiwanese Speech Synthesis with Tacotron2☆26Oct 2, 2022Updated 3 years ago
- ☆13Sep 25, 2024Updated last year
- Attention-Enhanced Short-Time Wiener Solution for Acoustic Echo Cancellation☆28Nov 12, 2025Updated 7 months ago
- Detect and remove or lower the volume of breathing in speech recordings.☆15May 14, 2025Updated last year
- ☆21Mar 3, 2026Updated 3 months ago
- ☆47Apr 16, 2023Updated 3 years ago
- A PyTorch implementation of the universal neural vocoder☆68Nov 6, 2020Updated 5 years ago
- Speaker-aware CTC (SACTC) for multi-talker overlapped speech recognition.☆22May 26, 2025Updated last year
- ☆13Mar 23, 2026Updated 2 months ago
- Code for the paper: MACE: Leveraging Audio for Evaluating Audio Captioning Systems☆13Jan 16, 2025Updated last year
- **Interspeech 2022** 《SpeechPrompt: An Exploration of Prompt Tuning on Generative Spoken Language Model for Speech Processing Tasks》Speec…☆102Apr 10, 2025Updated last year
- A lightweight Python library for running TTS models with a unified API.☆20Feb 18, 2025Updated last year
- Forced alignment decoder for Whisper.☆16Mar 13, 2024Updated 2 years ago
- ☆12Nov 7, 2024Updated last year
- Understanding and Tackling Hallucinations in Large Audio-Language Models | ICASSP 2025, Interspeech 2024☆34Mar 14, 2025Updated last year
- Toward Multi Modality Language Model - implementation of GPT-4o/Project Astra☆16Dec 10, 2024Updated last year
- ☆19Sep 10, 2024Updated last year
- text to speech☆10Mar 19, 2024Updated 2 years ago
- A PyTorch implementation of Conv-TasNet described in "TasNet: Surpassing Ideal Time-Frequency Masking for Speech Separation" with Permuta…☆11Aug 8, 2020Updated 5 years ago
- eCMU: An Efficient Phase-aware Framework for Music Source Separation with Conformer (IEEE RIVF23)☆10Oct 30, 2024Updated last year
- A complete academic research Skill suite. Supports Claude Code, ChatGPT / Codex CLI, and Gemini CLI.☆91Apr 4, 2026Updated 2 months ago
- This repository contains prompts & best practices to annotate audio clips with a very high degree of details using Audio-Language-Models☆35Oct 13, 2024Updated last year
- Tools for the automatic detection of speech-related inhalation events and characterisation of the speech respiratory cycle.☆11Feb 17, 2024Updated 2 years ago
- Robust Speech Recognition via Large-Scale Weak Supervision☆19Dec 1, 2022Updated 3 years ago
- [APSIPA'22] Exploring Speaker Age Estimation on Different Self-Supervised Learning Models☆14Oct 19, 2022Updated 3 years ago
- Textless (ASR-transcript free) Spoken Question Answering. The official release of NMSQA dataset and the implementation of "DUAL: Textless…☆35Aug 10, 2023Updated 2 years ago
- Speech-MASSIVE is a multilingual Spoken Language Understanding (SLU) dataset comprising the speech counterpart for a portion of the MASSI…☆25Oct 8, 2025Updated 8 months ago
- A Hierarchical Approach for Generating Descriptive Image Paragraphs☆10Mar 27, 2020Updated 6 years ago
- ☆13Mar 7, 2022Updated 4 years ago
- ☆10Feb 16, 2025Updated last year
- ☆31Jul 13, 2023Updated 2 years ago