This repository focuses on leveraging OpenAI's Whisper model for speech recognition in Chinese (Mandarin) and Taiwanese Hokkien languages. It includes tools and scripts for data preprocessing, model training, and evaluation, tailored to improve speech recognition accuracy for these languages.
☆71Mar 1, 2025Updated last year
Alternatives and similar repositories for ChineseTaiwaneseWhisper
Users that are interested in ChineseTaiwaneseWhisper are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- fine-tune Whipser model for Taiwanese speech recognition☆37Mar 23, 2023Updated 3 years ago
- Breeze ASR 25 是一款先進的自動語音辨識(ASR)模型,基於 Whisper-large-v2 微調而成,特別針對台灣華語以及華語與英語混用的情境進行優化。Breeze ASR 25 is an advanced ASR model fine-tuned fro…☆97Jul 1, 2025Updated 10 months ago
- Semi-supervised Learning for Multi-speaker Text-to-speech Synthesis Using Discrete Speech Representation☆39Jul 16, 2020Updated 5 years ago
- Vocoder-Free Non-Parallel Conversion of Whispered Speech With Masked Cycle-Consistent Generative Adversarial Networks☆17Aug 18, 2023Updated 2 years ago
- The accompanying code for "Exploring the limits of decoder-only models trained on public speech recognition corpora" (Ankit Gupta, George…☆20Oct 11, 2024Updated last year
- Proton VPN Special Offer - Get 70% off • AdSpecial partner offer. Trusted by over 100 million users worldwide. Tested, Approved and Recommended by Experts.
- Taiwanese Translation with BERT based model and RNN. Collection of Taiwanese text corpus☆13Oct 15, 2022Updated 3 years ago
- ☆15Sep 9, 2021Updated 4 years ago
- ☆10Sep 19, 2022Updated 3 years ago
- Taiwanese Speech Synthesis with Tacotron2☆26Oct 2, 2022Updated 3 years ago
- ☆13Sep 25, 2024Updated last year
- Attention-Enhanced Short-Time Wiener Solution for Acoustic Echo Cancellation☆28Nov 12, 2025Updated 5 months ago
- ☆21Mar 3, 2026Updated 2 months ago
- Detect and remove or lower the volume of breathing in speech recordings.☆14May 14, 2025Updated 11 months ago
- ☆47Apr 16, 2023Updated 3 years ago
- End-to-end encrypted email - Proton Mail • AdSpecial offer: 40% Off Yearly / 80% Off First Month. All Proton services are open source and independently audited for security.
- A PyTorch implementation of the universal neural vocoder☆67Nov 6, 2020Updated 5 years ago
- Speaker-aware CTC (SACTC) for multi-talker overlapped speech recognition.☆22May 26, 2025Updated 11 months ago
- ☆12Mar 23, 2026Updated last month
- Code for the paper: MACE: Leveraging Audio for Evaluating Audio Captioning Systems☆13Jan 16, 2025Updated last year
- **Interspeech 2022** 《SpeechPrompt: An Exploration of Prompt Tuning on Generative Spoken Language Model for Speech Processing Tasks》Speec…☆102Apr 10, 2025Updated last year
- A lightweight Python library for running TTS models with a unified API.☆21Feb 18, 2025Updated last year
- Forced alignment decoder for Whisper.☆15Mar 13, 2024Updated 2 years ago
- ☆12Nov 7, 2024Updated last year
- Understanding and Tackling Hallucinations in Large Audio-Language Models | ICASSP 2025, Interspeech 2024☆35Mar 14, 2025Updated last year
- AI Agents on DigitalOcean Gradient AI Platform • AdBuild production-ready AI agents using customizable tools or access multiple LLMs through a single endpoint. Create custom knowledge bases or connect external data.
- Toward Multi Modality Language Model - implementation of GPT-4o/Project Astra☆16Dec 10, 2024Updated last year
- ☆19Sep 10, 2024Updated last year
- A PyTorch implementation of Conv-TasNet described in "TasNet: Surpassing Ideal Time-Frequency Masking for Speech Separation" with Permuta…☆11Aug 8, 2020Updated 5 years ago
- text to speech☆10Mar 19, 2024Updated 2 years ago
- eCMU: An Efficient Phase-aware Framework for Music Source Separation with Conformer (IEEE RIVF23)☆10Oct 30, 2024Updated last year
- Tools for the automatic detection of speech-related inhalation events and characterisation of the speech respiratory cycle.☆11Feb 17, 2024Updated 2 years ago
- This repository contains prompts & best practices to annotate audio clips with a very high degree of details using Audio-Language-Models☆35Oct 13, 2024Updated last year
- Robust Speech Recognition via Large-Scale Weak Supervision☆19Dec 1, 2022Updated 3 years ago
- [APSIPA'22] Exploring Speaker Age Estimation on Different Self-Supervised Learning Models☆14Oct 19, 2022Updated 3 years ago
- Deploy to Railway using AI coding agents - Free Credits Offer • AdUse Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
- Textless (ASR-transcript free) Spoken Question Answering. The official release of NMSQA dataset and the implementation of "DUAL: Textless…☆35Aug 10, 2023Updated 2 years ago
- Speech-MASSIVE is a multilingual Spoken Language Understanding (SLU) dataset comprising the speech counterpart for a portion of the MASSI…☆24Oct 8, 2025Updated 7 months ago
- A Hierarchical Approach for Generating Descriptive Image Paragraphs☆10Mar 27, 2020Updated 6 years ago
- ☆13Mar 7, 2022Updated 4 years ago
- ☆10Feb 16, 2025Updated last year
- ☆31Jul 13, 2023Updated 2 years ago
- ☆44Sep 19, 2024Updated last year