sandy1990418/ChineseTaiwaneseWhisper

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/sandy1990418/ChineseTaiwaneseWhisper)

sandy1990418 / ChineseTaiwaneseWhisper

This repository focuses on leveraging OpenAI's Whisper model for speech recognition in Chinese (Mandarin) and Taiwanese Hokkien languages. It includes tools and scripts for data preprocessing, model training, and evaluation, tailored to improve speech recognition accuracy for these languages.

☆72

Alternatives and similar repositories for ChineseTaiwaneseWhisper

Users that are interested in ChineseTaiwaneseWhisper are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

ga642381 / Taiwanese-Whisper
View on GitHub
fine-tune Whipser model for Taiwanese speech recognition
☆37Mar 23, 2023Updated 3 years ago
thc1006 / breeze-asr-taigi
View on GitHub
Taiwanese Hokkien (Taigi) speech-to-text transcriber - MediaTek Breeze-ASR-26 with faster-whisper, tuned for RTX 3050 4GB low-VRAM GPUs. …
☆20Jul 20, 2026Updated last week
adi-gov-tw / Taiwan-Tongues-ASR-CE
View on GitHub
Taiwan Tongues ASR CE 是一個開源語音辨識（Automatic Speech Recognition, ASR）模型專案，專為台灣多元語言環境設計。本模型支援國語、台語、客語與英語，提供本地多語混合語音辨識，讓開發者與資訊服務業者可運用此開源模型進行…
☆64Jun 24, 2026Updated last month
mtkresearch / Breeze-ASR-25
View on GitHub
Breeze ASR 25 是一款先進的自動語音辨識（ASR）模型，基於 Whisper-large-v2 微調而成，特別針對台灣華語以及華語與英語混用的情境進行優化。Breeze ASR 25 is an advanced ASR model fine-tuned fro…
☆184Jul 1, 2025Updated last year
ga642381 / Taiwanese-Translation
View on GitHub
Taiwanese Translation with BERT based model and RNN. Collection of Taiwanese text corpus
☆13Oct 15, 2022Updated 3 years ago
Managed hosting for WordPress and PHP on Cloudways • Ad
Managed hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
ttaoREtw / semi-tts
View on GitHub
Semi-supervised Learning for Multi-speaker Text-to-speech Synthesis Using Discrete Speech Representation
☆39Jul 16, 2020Updated 6 years ago
ag1988 / mel-asr
View on GitHub
The accompanying code for "Exploring the limits of decoder-only models trained on public speech recognition corpora" (Ankit Gupta, George…
☆21Oct 11, 2024Updated last year
nervjack2 / Speech2Unit
View on GitHub
☆13Sep 25, 2024Updated last year
Chung-I / youtube-asr-crawler
View on GitHub
☆10Sep 19, 2022Updated 3 years ago
grtzsohalf / buy_vs_rent_and_invest
View on GitHub
☆15Sep 9, 2021Updated 4 years ago
ga642381 / Taiwanese-Speech-Synthesis
View on GitHub
Taiwanese Speech Synthesis with Tacotron2
☆26Oct 2, 2022Updated 3 years ago
kjw11 / Speaker-Aware-CTC
View on GitHub
Speaker-aware CTC (SACTC) for multi-talker overlapped speech recognition.
☆22May 26, 2025Updated last year
ga642381 / Spoken-Dialogue-Model-Survey
View on GitHub
A survey of spoken dialogue models (SDMs) with speech input and speech output. Focus on their Intermediate Representation and Generation …
☆31Mar 24, 2026Updated 4 months ago
voidful / MMLM
View on GitHub
Toward Multi Modality Language Model - implementation of GPT-4o/Project Astra
☆16Dec 10, 2024Updated last year
Wordpress hosting with auto-scaling - Free Trial Offer • Ad
Fully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
audiodemo / voice-conversion
View on GitHub
Vocoder-Free Non-Parallel Conversion of Whispered Speech With Masked Cycle-Consistent Generative Adversarial Networks
☆17Aug 18, 2023Updated 2 years ago
voidful / wav2vec2-xlsr-multilingual-56
View on GitHub
56 language, 1 model Multilingual ASR
☆25Jul 25, 2021Updated 5 years ago
kuan2jiu99 / audio-hallucination
View on GitHub
Understanding and Tackling Hallucinations in Large Audio-Language Models | ICASSP 2025, Interspeech 2024
☆34Mar 14, 2025Updated last year
voidful / asrp
View on GitHub
ASR text preprocessing utility
☆21Aug 5, 2024Updated last year
ga642381 / SpeechPrompt
View on GitHub
**Interspeech 2022** 《SpeechPrompt: An Exploration of Prompt Tuning on Generative Spoken Language Model for Speech Processing Tasks》Speec…
☆102Apr 10, 2025Updated last year
DanielLin94144 / DUAL-textless-SQA
View on GitHub
Textless (ASR-transcript free) Spoken Question Answering. The official release of NMSQA dataset and the implementation of "DUAL: Textless…
☆35Aug 10, 2023Updated 2 years ago
yistLin / beamertheme-yeast
View on GitHub
Yeast, a lite and light beamer theme
☆18Dec 6, 2020Updated 5 years ago
bartlomiej-pluta / android-tts-server
View on GitHub
The Android application providing user with REST-based interface for utilizing built-in Android's TTS engine. The web service is highly c…
☆11Jul 28, 2020Updated 6 years ago
voidful / llm-codec
View on GitHub
LLM-Codec: Neural Audio Codec Meets Language Model Objectives
☆23May 3, 2026Updated 2 months ago
Deploy to Railway using AI coding agents - Free Credits Offer • Ad
Use Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
JSALT-2022-SSL / superb-prosody
View on GitHub
☆31Jul 13, 2023Updated 3 years ago
ga642381 / FlappyBird
View on GitHub
Super Flappy Bird in p5.js
☆10Mar 8, 2021Updated 5 years ago
Speech-Lab-IITM / CCC-wav2vec-2.0
View on GitHub
Code for the method proposed in the paper:- ccc-wav2vec 2.0: Clustering aided Cross-Contrastive learning of Self-Supervised speech repres…
☆23Mar 18, 2024Updated 2 years ago
Deep-unlearning / Llasa-GRPO
View on GitHub
☆18Nov 19, 2025Updated 8 months ago
shang0712 / HierTTS
View on GitHub
☆47Apr 16, 2023Updated 3 years ago
yistLin / universal-vocoder
View on GitHub
A PyTorch implementation of the universal neural vocoder
☆68Nov 6, 2020Updated 5 years ago
MingLunHan / CIF-ColDec
View on GitHub
[ICASSP 2022] Improving End-to-End Contextual Speech Recognition with Fine-Grained Contextual Knowledge Selection
☆25Jul 14, 2026Updated 2 weeks ago
lukaszliniewicz / breath-removal
View on GitHub
Detect and remove or lower the volume of breathing in speech recordings.
☆17May 14, 2025Updated last year
cpii-cai / PunCantonese
View on GitHub
A Benchmark Corpus for Low-Resource Cantonese Punctuation Restoration from Speech Transcripts
☆15Dec 3, 2024Updated last year
Virtual machines for every use case on DigitalOcean • Ad
Get dependable uptime with 99.99% SLA, simple security tools, and predictable monthly pricing with DigitalOcean's virtual machines, called Droplets.
h-munakata / Lighthouse-Wrapper-for-Audio-Moment-Retrieval
View on GitHub
☆13Mar 23, 2026Updated 4 months ago
duyichao / NPDA-KNN-ST
View on GitHub
Official implementation of EMNLP'2022 paper "Non-Parametric Domain Adaptation for End-to-End Speech Translation"
☆11Oct 26, 2022Updated 3 years ago
ga642381 / AudioCodec-Hub
View on GitHub
AudioCodec-Hub is a Python library for encoding and decoding audio data, supporting various neural audio codec models
☆25Sep 26, 2023Updated 2 years ago
George0828Zhang / torch_cif
View on GitHub
A fast parallel PyTorch implementation of the "CIF: Continuous Integrate-and-Fire for End-to-End Speech Recognition" https://arxiv.org/ab…
☆37Feb 10, 2024Updated 2 years ago
roger-tseng / av-superb
View on GitHub
A Multi-Task Evaluation Benchmark for Audio-Visual Representation Models (ICASSP 2024)
☆58Apr 17, 2024Updated 2 years ago
nobel861017 / Conv-TasNet
View on GitHub
A PyTorch implementation of Conv-TasNet described in "TasNet: Surpassing Ideal Time-Frequency Masking for Speech Separation" with Permuta…
☆11Aug 8, 2020Updated 5 years ago
csalt-research / accented-codebooks-asr
View on GitHub
☆19Sep 10, 2024Updated last year