anton-jeran/MULTI-AUDIODEC

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/anton-jeran/MULTI-AUDIODEC)

anton-jeran / MULTI-AUDIODEC

This is the official implementation of our multi-channel multi-speaker multi-spatial neural audio codec architecture.

☆54

Alternatives and similar repositories for MULTI-AUDIODEC

Users that are interested in MULTI-AUDIODEC are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

JusperLee / Gull-Codec-Training
View on GitHub
☆12Mar 11, 2025Updated last year
yoongi43 / VRVQ
View on GitHub
Implementation of the paper "Variable Bitrate Residual Vector Quantization for Audio Coding"
☆11Apr 10, 2025Updated last year
XZWY / SpatialCodec
View on GitHub
Implementation of SpatialCodec.
☆71Sep 23, 2023Updated 2 years ago
SonyResearch / VRVQ
View on GitHub
Variable Bitrate Residual Vector Quantization for Audio Coding
☆53May 1, 2025Updated last year
haiciyang / LaDiffCodec
View on GitHub
ICASSP 2024 - Generative De-Quantization for Neural Speech Codec via Latent Diffusion.
☆56Nov 16, 2025Updated 7 months ago
Managed hosting for WordPress and PHP on Cloudways • Ad
Managed hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
yzGuu830 / efficient-speech-codec
View on GitHub
[EMNLP 2024] ESC: Efficient Speech Coding with Cross-Scale Residual Vector Quantized Transformers
☆126Mar 20, 2025Updated last year
anton-jeran / IR-GAN
View on GitHub
Augmenting Room Impulse Response
☆43Sep 15, 2023Updated 2 years ago
AbrahamSanders / codec-bpe
View on GitHub
Implementation of Acoustic BPE (Shen et al., 2024), extended for RVQ-based Neural Audio Codecs
☆76Dec 3, 2025Updated 7 months ago
xi-j / Style-Talker
View on GitHub
An official implementation of Style-Talker for Spoken Dialogue Generation
☆23Jan 12, 2025Updated last year
speechnovateur / languagecodec_tmp
View on GitHub
Temporary anonymous version
☆22Mar 20, 2024Updated 2 years ago
Yip-Jia-Qi / codecformer
View on GitHub
☆21Jul 15, 2024Updated 2 years ago
NeuroWave-ai / CUCVAE-TTS
View on GitHub
☆25Mar 12, 2022Updated 4 years ago
qiuk2 / AAR
View on GitHub
[Official Implementation] Acoustic Autoregressive Modeling 🔥
☆75Aug 24, 2024Updated last year
WangHelin1997 / LibriLightMix-WHAMR
View on GitHub
Python scripts to create noisy and reverberant 2-speaker mixture audio with Libri-Light and WHAM
☆17Nov 7, 2024Updated last year
Managed Kubernetes at scale on DigitalOcean • Ad
DigitalOcean Kubernetes includes the control plane, bandwidth allowance, container registry, automatic updates, and more for free.
0nutation / SLMTokBench
View on GitHub
SLMTokBench for paper "SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models"
☆37Aug 29, 2023Updated 2 years ago
ahogg / HRTF-upsampling-with-a-generative-adversarial-network-using-a-gnomonic-equiangular-projection
View on GitHub
☆12Nov 1, 2024Updated last year
reppy4620 / vocoders
View on GitHub
My vocoder experiments
☆31Jul 26, 2025Updated 11 months ago
AudiogenAI / agc
View on GitHub
Audiogen Codec
☆146Jul 9, 2024Updated 2 years ago
hhguo / SoCodec
View on GitHub
Ultra-low-bitrate Speech Codec for Speech Language Modeling Applications
☆92Dec 20, 2024Updated last year
anton-jeran / MESH2IR
View on GitHub
This is the official implementation of our mesh-based neural network (MESH2IR) to generate acoustic impulse responses (IRs) for indoor 3D…
☆110Jul 24, 2024Updated last year
ryota-komatsu / speech_resynth
View on GitHub
Speech Resynthesis and Language Modeling
☆27Jun 11, 2025Updated last year
WangHelin1997 / Automatic_Speech_Annotator
View on GitHub
Automatic speech annotator processing speech with voice activaty detection, overlapping speech detection, speaker diarization and automat…
☆33Jun 14, 2024Updated 2 years ago
AlanBaade / SyllableLM
View on GitHub
Official Code for SyllableLM: Learning Coarse Semantic Units for Speech Language Models
☆63Jul 1, 2025Updated last year
Managed hosting for WordPress and PHP on Cloudways • Ad
Managed hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
haoheliu / SemantiCodec-inference
View on GitHub
Ultra-low bitrate neural audio codec (0.31~1.40 kbps) with a better semantic in the latent space.
☆254Mar 7, 2025Updated last year
introlab / uimvdr
View on GitHub
☆13Oct 11, 2024Updated last year
asappresearch / simple-tts
View on GitHub
Contains the code associated with the ICLR submission for our text-to-speech diffusion model
☆57Oct 31, 2023Updated 2 years ago
WangHelin1997 / Aty-TTS
View on GitHub
Aty-TTS: Improving fairness for spoken language understanding in atypical speech with Text-to-Speech
☆11May 14, 2025Updated last year
exercise-book-yq / Supercodec
View on GitHub
☆51Mar 5, 2026Updated 4 months ago
Respaired / RiFornet_Vocoder
View on GitHub
a Neural Vocoder supporting Ring Attention, Conformer and NSF.
☆25Aug 1, 2025Updated 11 months ago
WangHelin1997 / Fast-GeCo
View on GitHub
Source code and demo for INTERSPEECH 2024 paper: Noise-robust Speech Separation with Fast Generative Correction
☆50Nov 19, 2024Updated last year
voidful / Codec-SUPERB
View on GitHub
Audio Codec Speech processing Universal PERformance Benchmark
☆308Jul 4, 2026Updated last week
yangdongchao / LLM-Codec
View on GitHub
The open source code for LLM-Codec
☆147Aug 18, 2024Updated last year
Managed Database hosting by DigitalOcean • Ad
PostgreSQL, MySQL, MongoDB, Kafka, Valkey, and OpenSearch available. Automatically scale up storage and focus on building your apps.
LAION-AI / Text-to-speech
View on GitHub
☆61Nov 4, 2023Updated 2 years ago
NTIA / alignnet
View on GitHub
Train no-reference speech quality estimators with multiple datasets via learned, per-dataset alignments.
☆18Aug 1, 2025Updated 11 months ago
AI-S2-Lab / FluentEditor
View on GitHub
[InterSpeech'2024] FluentEditor:Text-based Speech Editing by Considering Acoustic and Prosody Consistency
☆62Oct 23, 2024Updated last year
Sreyan88 / ReCLAP
View on GitHub
☆33Dec 23, 2025Updated 6 months ago
p1an-lin-jung / wv_tts
View on GitHub
☆19Mar 22, 2024Updated 2 years ago
sp-uhh / storm
View on GitHub
StoRM: A Diffusion-based Stochastic Regeneration Model for Speech Enhancement and Dereverberation
☆255Sep 13, 2024Updated last year
p0p4k / Matcha-TTS-2
View on GitHub
E2E TTS using Conditional Flow Matching (Experimental*)
☆71Nov 10, 2023Updated 2 years ago