yangdongchao/SoundStorm

The reproduced code for Google's SoundStorm

☆275

Alternatives and similar repositories for SoundStorm

Users who are interested in SoundStorm are comparing it to the repositories listed below.

rishikksh20 / SoundStorm-pytorch
Google's SoundStorm: Efficient Parallel Audio Generation
☆131 · Aug 8, 2023 · Updated 2 years ago
0nutation / USLM
Unified Speech Language Model for paper "SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models" (ICLR 2024)
☆152 · Sep 14, 2023 · Updated 2 years ago
yangdongchao / AcademiCodec
AcademiCodec: An Open Source Audio Codec Model for Academic Research
☆674 · Dec 27, 2023 · Updated 2 years ago
mct10 / RepCodec
Models and code for RepCodec: A Speech Representation Codec for Speech Tokenization
☆196 · Jul 12, 2024 · Updated 2 years ago
voidful / Codec-SUPERB
Audio Codec Speech processing Universal PERformance Benchmark
☆308 · Jul 4, 2026 · Updated 3 weeks ago
lifeiteng / SoundStorm
☆71 · Jul 13, 2023 · Updated 3 years ago
X-LANCE / VoiceFlow-TTS
[ICASSP 2024] This is the official code for "VoiceFlow: Efficient Text-to-Speech with Rectified Flow Matching"
☆376 · Sep 3, 2024 · Updated last year
yangdongchao / UniAudio
The Open Source Code of UniAudio
☆605 · Jul 22, 2024 · Updated 2 years ago
facebookresearch / AudioDec
An Open-source Streaming High-fidelity Neural Audio Codec
☆510 · Mar 4, 2025 · Updated last year
innnky / ar-vits
text to speech using autoregressive transformer and VITS
☆248 · Apr 3, 2024 · Updated 2 years ago
X-LANCE / UniCATS-CTX-vec2wav
[AAAI 2024] Code for CTX-vec2wav in UniCATS
☆130 · Jun 11, 2024 · Updated 2 years ago
sony / bigvsan
Pytorch implementation of BigVSAN
☆203 · Dec 9, 2025 · Updated 7 months ago
ZhangXInFD / SpeechTokenizer
This is the code for the SpeechTokenizer presented in the SpeechTokenizer: Unified Speech Tokenizer for Speech Language Models. Samples a…
☆658 · Jun 9, 2024 · Updated 2 years ago
ZhangXInFD / soundstorm-speechtokenizer
Implementation of SoundStorm built upon SpeechTokenizer.
☆116 · Nov 2, 2023 · Updated 2 years ago
yangdongchao / LLM-Codec
The open source code for LLM-Codec
☆147 · Aug 18, 2024 · Updated last year
adelacvg / NS2VC
Unofficial implementation of NaturalSpeech2 for Voice Conversion and Text to Speech
☆237 · Feb 29, 2024 · Updated 2 years ago
zhenye234 / CoMoSpeech
ACM MM 2023 CoMoSpeech: One-Step Speech and Singing Voice Synthesis via Consistency Model
☆214 · Apr 26, 2024 · Updated 2 years ago
p0p4k / Matcha-TTS-2
E2E TTS using Conditional Flow Matching (Experimental*)
☆71 · Nov 10, 2023 · Updated 2 years ago
chomeyama / SiFiGAN
Official implementation of the source-filter HiFiGAN vocoder
☆275 · Jul 29, 2023 · Updated 2 years ago
lucidrains / spear-tts-pytorch
Implementation of Spear-TTS - multi-speaker text-to-speech attention network, in Pytorch
☆277 · Oct 30, 2023 · Updated 2 years ago
modelscope / FunCodec
FunCodec is a research-oriented toolkit for audio quantization and downstream applications, such as text-to-speech synthesis, music gener…
☆445 · Jan 25, 2024 · Updated 2 years ago
yangdongchao / SimpleSpeech
The open source code for SimpleSpeech series
☆147 · Oct 8, 2024 · Updated last year
Aria-K-Alethia / laughter-synthesis
Official implementation of the paper "Laughter Synthesis using Pseudo Phonetic Tokens with a Large-scale In-the-wild Laughter Corpus" acc…
☆77 · Jul 16, 2023 · Updated 3 years ago
adelacvg / ttts
Train the next generation of TTS systems.
☆169 · Sep 13, 2024 · Updated last year
0913ktg / SC_VALL-E
Style-Controllable Zero-Shot Text to Speech Synthesizer based on VALL-E
☆136 · Oct 23, 2024 · Updated last year
lifeiteng / VoiceBox
Voicebox: Text-Guided Multilingual Universal Speech Generation at Scale
☆29 · Aug 4, 2023 · Updated 2 years ago
gemelo-ai / vocos
Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis
☆1,143 · Aug 7, 2024 · Updated last year
ga642381 / SpeechGen
《SpeechGen: Unlocking the Generative Power of Speech Language Models with Prompts》
☆77 · Jun 9, 2023 · Updated 3 years ago
X-LANCE / StoryTTS
[ICASSP 2024] StoryTTS: A Highly Expressive Text-to-Speech Dataset with Rich Textual Expressiveness Annotations
☆141 · Apr 27, 2024 · Updated 2 years ago
WelkinYang / WaveODE
An ODE-based generative neural vocoder using Rectified Flow
☆58 · Apr 29, 2023 · Updated 3 years ago
yl4579 / HiFTNet
HiFTNet: A Fast High-Quality Neural Vocoder with Harmonic-plus-Noise Filter and Inverse Short Time Fourier Transform
☆257 · Jan 14, 2025 · Updated last year
lifeiteng / NaturalSpeech2
☆33 · Jun 29, 2023 · Updated 3 years ago
lakahaga / dc-comix-tts
Implementation of DCComix TTS: An End-to-End Expressive TTS with Discrete Code Collaborated with Mixer
☆74 · Aug 21, 2023 · Updated 2 years ago
MiscellaneousStuff / PhoneLM
(R&D) Text to speech using phonemes as inputs and audio codec codes as outputs. Loosely based on MegaByte, VALL-E and Encodec.
☆48 · Sep 4, 2023 · Updated 2 years ago
maum-ai / phaseaug
ICASSP 2023 Accepted
☆191 · May 6, 2024 · Updated 2 years ago
hayeong0 / Diff-HierVC
Official Pytorch Implementation of "Diff-HierVC: Diffusion-based Hierarchical Voice Conversion with Robust Pitch Generation and Masked Pr…
☆237 · Jul 3, 2024 · Updated 2 years ago
VinAIResearch / XPhoneBERT
XPhoneBERT: A Pre-trained Multilingual Model for Phoneme Representations for Text-to-Speech (INTERSPEECH 2023)
☆354 · Jul 22, 2024 · Updated 2 years ago
rishikksh20 / Avocodo-pytorch
Avocodo: Generative Adversarial Network for Artifact-free Vocoder
☆122 · Jul 14, 2022 · Updated 4 years ago
hhguo / SoCodec
Ultra-low-bitrate Speech Codec for Speech Language Modeling Applications
☆92 · Dec 20, 2024 · Updated last year