JiwanSeo / RAQ-VAE
Rate-Adaptive Quantization: A Multi-Rate Codebook Adaptation for Vector Quantization-based Generative Models
☆13Updated 2 weeks ago
Alternatives and similar repositories for RAQ-VAE
Users that are interested in RAQ-VAE are comparing it to the libraries listed below
- ☆12Updated last year
- Official implementation of INTERSPEECH 2022 Radio2Speech: High Quality Speech Recovery from Radio Frequency Signals☆12Updated last week
- Speech enhancement in noisy and reverberant environments using deep neural networks☆22Updated 3 weeks ago
- Official repository of the work "Low-complexity Unsupervised Audio Anomaly Detection exploiting Separable Convolutions and Angular Loss" …☆10Updated 10 months ago
- Code for INTERSPEECH 2023 paper "mdctGAN: Taming transformer-based GAN for speech super-resolution with Modified DCT spectra"☆61Updated 2 years ago
- DiffPhase: Generative Diffusion-based STFT Phase Retrieval☆16Updated 2 years ago
- Official Code Implementation for 'A Simple Early Exiting Framework for Accelerated Sampling in Diffusion Models'☆20Updated last year
- Conformer block with Rotary Position Embedding, modified from lucidrains' implementation☆16Updated last year
- Source code for the EMNLP 2025 paper “DM-Codec: Distilling Multimodal Representations for Speech Tokenization”☆53Updated 3 months ago
- TensorFlow implementation of "Finite Scalar Quantization: VQ-VAE Made Simple" (ICLR 2024)☆21Updated last year
- [SpeechCom Journal] Learning and controlling the source-filter representation of speech with a variational autoencoder☆44Updated 2 years ago
- Official source code of the INTERSPEECH 2023 paper: "Audio-Visual Speech Separation in Noisy Environments with a Lightweight Iterative Mo…☆20Updated 2 years ago
- An official implementation of "Deep Joint Source-Channel Coding with Iterative Source Error Correction"☆22Updated 2 years ago
- A lightweight audio codec based on a single quantizer☆66Updated last month
- A spoken version of the textual story cloze benchmark☆18Updated 2 years ago
- A toolkit for researchers in the multimodal sound separation.☆16Updated last year
- ☆38Updated last year
- Wideband beamforming: code reproducing Yan Shefeng's (鄢社峰) book on optimized beamforming (Chapter 9)☆25Updated 2 years ago
- Learning an Interpretable End-to-End Network for Real-Time Acoustic Beamforming☆13Updated last year
- Unofficial implementation of ResGrad: Residual Denoising Diffusion Probabilistic Models for Text to Speech☆18Updated 7 months ago
- ☆16Updated last year
- [InterSpeech'2024] FluentEditor: Text-based Speech Editing by Considering Acoustic and Prosody Consistency☆55Updated 11 months ago
- ☆11Updated 10 months ago
- This is the official repository of "Scalable Neural Vocoder from Range-Null Space Decomposition", which is submitted to TPAMI.☆17Updated 8 months ago
- SRTNet☆24Updated 2 years ago
- DOSE: Diffusion Dropout with Adaptive Prior for Speech Enhancement, Conference on Neural Information Processing Systems (NeurIPS), 2023☆57Updated 4 months ago
- PyTorch implementation of Conformer: Convolution-augmented Transformer for Speech Recognition☆15Updated 4 years ago
- Transformer based Self-Attention for Complex Numbers☆13Updated 3 years ago
- A neural speech codec based on discrete WavLM representations☆24Updated last year
- ESLTTS dataset☆16Updated 7 months ago