haydenshively/SoundStream

Implementation of SoundStream, an end-to-end neural audio codec

☆33

Alternatives and similar repositories for SoundStream

Users that are interested in SoundStream are comparing it to the libraries listed below.

kyegomez / SoundStream
Implementation of SoundStream from the paper: "SoundStream: An End-to-End Neural Audio Codec"
☆13 · Jan 27, 2025 · Updated last year
kaiidams / soundstream-pytorch
Unofficial PyTorch implementation of SoundStream with training code and a 16 kHz pretrained checkpoint
☆82 · Feb 9, 2026 · Updated 5 months ago
wesbz / SoundStream
This repository is an implementation of this article: https://arxiv.org/pdf/2107.03312.pdf
☆431 · Apr 21, 2022 · Updated 4 years ago
AI-Hypercomputer / ray-tpu
☆15 · May 11, 2025 · Updated last year
primepake / learnable-speech
Text-to-speech with a learnable audio encoder, without alignment to a transcript reference
☆54 · Sep 20, 2025 · Updated 10 months ago
BJTUSensor / CS_KSVD
This code aims to reconstruct the original BGS by using a compressed sensing method based on K-SVD algorithm.
☆10 · Oct 6, 2022 · Updated 3 years ago
samsad35 / code-ancogen
[ICASSP 2025] AnCoGen: Analysis, Control and Generation of Speech with a Masked Autoencoder
☆14 · Mar 11, 2025 · Updated last year
pengzhendong / wavesurfer
For audio visualization and playback in Jupyter notebooks.
☆18 · Nov 25, 2025 · Updated 8 months ago
nguyenvulebinh / AV-HuBERT-S2S
Huggingface Implementation of AV-HuBERT on the MuAViC Dataset
☆19 · Mar 6, 2025 · Updated last year
samsad35 / VQ-MAE-S-code
[ICASSPW] A Vector Quantized Masked AutoEncoder for speech emotion recognition
☆30 · Mar 4, 2024 · Updated 2 years ago
CodingVillainKor / SimpleDeepLearning
Simple Deep learning projects
☆18 · May 20, 2026 · Updated 2 months ago
ZhikangNiu / encodec-pytorch
unofficial implementation of the High Fidelity Neural Audio Compression
☆176 · Aug 15, 2024 · Updated last year
divetoh / tranquility
Web-based Personal Information Manager (PIM). Python, FastAPI, PostgreSQL, VUE, Quasar.
☆15 · Sep 29, 2022 · Updated 3 years ago
kyutai-labs / kaudio
Rust crate for some audio utilities
☆32 · Jun 17, 2026 · Updated last month
missuo / ClaudeProxy
Proxy for Anthropic Claude implemented in Go
☆13 · Mar 9, 2024 · Updated 2 years ago
google-research-datasets / LLAMA1-Test-Set
We introduce the LLAMA1 Test Set, a comprehensive open-domain world knowledge QA dataset for evaluating question-answering systems. We pr…
☆23 · Mar 14, 2024 · Updated 2 years ago
P2Oileen / CitationHelper
A small Google Scholar self-lookup script that displays your current citation count each time the shell is opened.
☆21 · Mar 3, 2025 · Updated last year
ArenAcikgoz / Whisper-Alignment
Forced alignment decoder for Whisper.
☆16 · Mar 13, 2024 · Updated 2 years ago
pengzhendong / ngram-punctuator
An N-gram punctuator for Chinese and English.
☆18 · Oct 14, 2025 · Updated 9 months ago
CarlWangChina / MuChin
MuChin: A Chinese Colloquial Description Benchmark for Evaluating Language Models in the Field of Music
☆27 · Jan 7, 2026 · Updated 6 months ago
FLwolfy / InnoEngine
A simple cross-platform game engine based on .NET framework.
☆16 · Jul 17, 2026 · Updated last week
interfas24 / RAAnalysis-py
RAAnalysis tool python version
☆12 · Apr 16, 2019 · Updated 7 years ago
JanWilczek / fdaf-double-talk-detector
Frequency-Dependent Adaptive Filtering Double Talk Detector.
☆13 · Mar 26, 2020 · Updated 6 years ago
jiang-du / Perceptual-CS
Official code for papers "Perceptual Compressive Sensing" at PRCV 2018 and "Fully Convolutional Measurement Network for Compressive Sensi…
☆18 · Aug 6, 2019 · Updated 6 years ago
pengzhendong / compute-wer
Compute WER and SER for speech recognition evaluation
☆27 · Jun 6, 2026 · Updated last month
GravityPoet / textream-zh
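A smart teleprompter that sits hidden just below the Mac camera, designed for video recording, live streaming, and meetings so you can keep natural eye contact. It supports Apple's built-in speech recognition and local AI models, and automatically tracks and scrolls the script at your actual speaking pace, so you no longer have to worry about forgetting lines or scrolling by hand.
☆19 · Feb 24, 2026 · Updated 5 months ago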
haoheliu / SemantiCodec-inference
Ultra-low bitrate neural audio codec (0.31~1.40 kbps) with a better semantic in the latent space.
☆255 · Mar 7, 2025 · Updated last year
Mddct / cosyvoice2-flow-optimized
faster inference
☆27 · Jan 20, 2025 · Updated last year
xiquan-li / MeanAudio
[ACL 2026 Main] MeanAudio: Fast and Faithful Text-to-Audio Generation with Mean Flows
☆142 · Sep 2, 2025 · Updated 10 months ago
Ubenwa / cryceleb2023
☆12 · Mar 18, 2024 · Updated 2 years ago
Mddct / transformer-vocos
☆35 · Sep 6, 2025 · Updated 10 months ago
LeeKeyu / abdominal_ultrasound_classification
Combining deep neural networks with PCA and k-NN classification for abdominal organ recognition in ultrasound images.
☆28 · Oct 12, 2021 · Updated 4 years ago
yoyoberenguer / SoundEffectLibrary
Sound effect library
☆10 · Mar 22, 2021 · Updated 5 years ago
shun60s / Vocal-Tube-Model
A very simple vocal tract model (a few-tube model) that generates vowel sounds
☆18 · Jun 27, 2026 · Updated 3 weeks ago
google / df-conformer
Audio samples accompanying publications related to DF-Conformer, a speech enhancement model.
☆36 · Jun 23, 2026 · Updated last month
media-sec-lab / ViT-VAE
☆30 · Mar 3, 2023 · Updated 3 years ago
193746 / VHASR
☆11 · Oct 31, 2024 · Updated last year
AleksandarHaber / Simulation-of-State-Space-Models-of-Dynamical-Systems-in-Cpp--Eigen-Matrix-Library-Tutorial
☆16 · Apr 10, 2026 · Updated 3 months ago
viewfinder-annn / AnyEnhance-v1
AnyEnhance-based Baseline for the CCF-AATC 2025 Challenge Track 1
☆64 · May 21, 2026 · Updated 2 months ago