kuan2jiu99/Awesome-Speech-Generation

Survey on speech generation work.

☆21

Alternatives and similar repositories for Awesome-Speech-Generation

Users who are interested in Awesome-Speech-Generation are comparing it to the repositories listed below.

0nutation / SLMTokBench
SLMTokBench for paper "SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models"
☆37 · Aug 29, 2023 · Updated 2 years ago
huckiyang / awesome-neural-reprogramming-prompting
A curated list of awesome adversarial reprogramming and input prompting methods for neural networks since 2022
☆40 · Nov 30, 2023 · Updated 2 years ago
WangHelin1997 / LibriLightMix-WHAMR
Python scripts to create noisy and reverberant 2-speaker mixture audio with Libri-Light and WHAM
☆17 · Nov 7, 2024 · Updated last year
ga642381 / Speech-Prompts-Adapters
This Repository surveys the paper focusing on Prompting and Adapters for Speech Processing.
☆113 · Aug 4, 2023 · Updated 2 years ago
speechnovateur / languagecodec_tmp
Temporary anonymous version
☆22 · Mar 20, 2024 · Updated 2 years ago
yangdongchao / LLM-Codec
The open source code for LLM-Codec
☆147 · Aug 18, 2024 · Updated last year
dihardchallenge / dihard3_baseline
☆30 · Jul 21, 2022 · Updated 3 years ago
mutiann / neural-lexicon-reader
Neural Lexicon Reader: Reduce Pronunciation Errors in End-to-end TTS by Leveraging External Textual Knowledge
☆21 · Jul 25, 2022 · Updated 3 years ago
declare-lab / HyperTTS
☆40 · Apr 15, 2024 · Updated 2 years ago
Aria-K-Alethia / speaking-rate-controllable-hifi-gan
☆16 · Apr 4, 2022 · Updated 4 years ago
WangHelin1997 / SpeechTasks
This is a list of speech tasks and datasets, which can provide training data for Generative AI, AIGC, AI model training, intelligent spee…
☆83 · Jun 7, 2024 · Updated 2 years ago
Takaaki-Saeki / DiscreteSpeechMetrics
Reference-aware automatic speech evaluation toolkit
☆183 · Dec 5, 2024 · Updated last year
nervjack2 / Speech2Unit
☆13 · Sep 25, 2024 · Updated last year
voidful / Codec-SUPERB
Audio Codec Speech processing Universal PERformance Benchmark
☆308 · Jul 4, 2026 · Updated 2 weeks ago
roger-tseng / CodecFake
A deepfake audio dataset for detecting fake speech from codec-based speech synthesis systems, Interspeech 2024
☆22 · Jul 27, 2024 · Updated last year
ShovalMessica / NAST
Official repository for NAST: Noise Aware Speech Tokenization for Speech Language Models (Interspeech 2024) https://arxiv.org/abs/2406.11…
☆46 · Jul 2, 2024 · Updated 2 years ago
cuhealthybrains / MT-LLM
The implementation for "Large Language Model Can Transcribe Speech in Multi-Talker Scenarios with Versatile Instructions"
☆50 · Apr 7, 2025 · Updated last year
lourson1091 / audiobertscore
☆15 · Nov 10, 2025 · Updated 8 months ago
HSUNEH / DOSE
☆19 · Sep 22, 2025 · Updated 9 months ago
kimsunwiub / BLOOM-Net
Source code for "BLOOM-Net: Blockwise Optimization for Masking Networks Toward Scalable and Efficient Speech Enhancement"
☆14 · Feb 13, 2022 · Updated 4 years ago
QxLabIreland / AQP
☆24 · Jun 13, 2022 · Updated 4 years ago
bagustris / s3prl-ser
S3PRL for Speech Emotion Recognition (see s3prl > downstream)
☆15 · Feb 28, 2026 · Updated 4 months ago
redmist328 / APNet2
Source code of APNet2, a vocoder
☆60 · Nov 23, 2023 · Updated 2 years ago
declare-lab / speech-adapters
Codes and datasets for our ICASSP2023 paper, Evaluating parameter-efficient transfer learning approaches on SURE benchmark for speech und…
☆43 · Mar 12, 2023 · Updated 3 years ago
shengcanxu / canoSpeech
text to speech
☆10 · Mar 19, 2024 · Updated 2 years ago
cwang621 / blsp-emo
BLSP-Emo: Towards Empathetic Large Speech-Language Models
☆61 · Jun 7, 2024 · Updated 2 years ago
p0p4k / Matcha-TTS-2
E2E TTS using Conditional Flow Matching (Experimental*)
☆71 · Nov 10, 2023 · Updated 2 years ago
ashi-ta / speechGLUE
SpeechGLUE is a speech version of the GLUE benchmark, driven by text-to-speech.
☆13 · Jun 2, 2023 · Updated 3 years ago
haoheliu / SemantiCodec-inference
Ultra-low bitrate neural audio codec (0.31~1.40 kbps) with a better semantic in the latent space.
☆254 · Mar 7, 2025 · Updated last year
liusongxiang / Large-Audio-Models
Keep track of big models in audio domain, including speech, singing, music etc.
☆515 · Jul 3, 2026 · Updated 2 weeks ago
tincans-ai / gazelle-inference
proof of concept conversation orchestrator with a speech-language model
☆20 · Oct 19, 2024 · Updated last year
ga642381 / speech-trident
Awesome speech/audio LLMs, representation learning, and codec models
☆1,239 · Jul 10, 2026 · Updated last week
nervjack2 / MelHuBERT
Official implementation of MelHuBERT
☆70 · Feb 21, 2026 · Updated 4 months ago
Beilong-Tang / lauraTSE_code
Official Implementation of LauraTSE: Target Speaker Extraction using Auto-Regressive Decoder-Only Language Models.
☆37 · Nov 9, 2025 · Updated 8 months ago
7Xin / DPI-TTS
☆13 · Sep 12, 2024 · Updated last year
qiuk2 / AAR
[Official Implementation] Acoustic Autoregressive Modeling 🔥
☆75 · Aug 24, 2024 · Updated last year
SLPcourse / Singing-Voice-Conversion
Project of Singing Voice Conversion.
☆16 · Oct 27, 2023 · Updated 2 years ago
linshuqing / NoteRepo-remote-github
☆25 · Oct 15, 2025 · Updated 9 months ago
eurecom-asp / rawnet2-antispoofing
This repository includes the code to reproduce our paper "End-to-end anti-spoofing with RawNet2" (https://arxiv.org/abs/2011.01108) publi…
☆70 · Aug 8, 2023 · Updated 2 years ago