JiwanSeo / RAQ-VAE
Rate-Adaptive Quantization: A Multi-Rate Codebook Adaptation for Vector Quantization-based Generative Models
☆15Updated 3 months ago
Alternatives and similar repositories for RAQ-VAE
Users who are interested in RAQ-VAE are comparing it to the libraries listed below
- ☆14Updated 2 years ago
- An official implementation of "Deep Joint Source-Channel Coding with Iterative Source Error Correction"☆24Updated 2 years ago
- DiffPhase: Generative Diffusion-based STFT Phase Retrieval☆16Updated 2 years ago
- Unofficial implementation of ResGrad: Residual Denoising Diffusion Probabilistic Models for Text to Speech☆19Updated 10 months ago
- Official Code Implementation for 'A Simple Early Exiting Framework for Accelerated Sampling in Diffusion Models'☆20Updated last year
- LAFMA: A Latent Flow Matching Model for Text-to-Audio Generation (INTERSPEECH 2024)☆43Updated last year
- Implementation of the transformer proposed in "Building Blocks for a Complex-Valued Transformer Architecture"☆85Updated 2 years ago
- Official implementation of INTERSPEECH 2022 Radio2Speech: High Quality Speech Recovery from Radio Frequency Signals☆16Updated 3 months ago
- TensorFlow implementation of "Finite Scalar Quantization: VQ-VAE Made Simple" (ICLR 2024)☆21Updated 2 years ago
- A lightweight audio codec based on a single quantizer☆65Updated 4 months ago
- ☆28Updated last year
- A spoken version of the textual story cloze benchmark☆20Updated 2 years ago
- [SpeechCom Journal] Learning and controlling the source-filter representation of speech with a variational autoencoder☆45Updated 2 years ago
- Denoising Diffusion Autoregressive Model for Raw Speech Waveform Generation☆32Updated last year
- Source code for the EMNLP 2025 paper “DM-Codec: Distilling Multimodal Representations for Speech Tokenization”☆54Updated 6 months ago
- Code for INTERSPEECH 2023 paper "mdctGAN: Taming transformer-based GAN for speech super-resolution with Modified DCT spectra"☆62Updated 2 years ago
- [InterSpeech'2024] FluentEditor: Text-based Speech Editing by Considering Acoustic and Prosody Consistency☆59Updated last year
- Conformer block with Rotary Position Embedding, modified from lucidrains' implementation☆16Updated last year
- ☆16Updated 2 years ago
- Official PyTorch implementation of "EdVAE: Mitigating Codebook Collapse with Evidential Discrete Variational Autoencoders"☆12Updated last year
- DOSE: Diffusion Dropout with Adaptive Prior for Speech Enhancement, Conference on Neural Information Processing Systems (NeurIPS), 2023☆59Updated 7 months ago
- PyTorch implementation of Conformer: Convolution-augmented Transformer for Speech Recognition☆18Updated 4 years ago
- ☆61Updated last year
- AudioBERT 📢 : Audio Knowledge Augmented Language Model (ICASSP 2025)☆41Updated 10 months ago
- SoFlow: Solution Flow Models for One-Step Generative Modeling☆89Updated last week
- An ODE-based generative neural vocoder using Rectified Flow☆58Updated 2 years ago
- https://arxiv.org/abs/2111.00195☆16Updated 3 years ago
- This is the code of the paper "SpectrumFM: A Foundation Model for Intelligent Spectrum Management"☆22Updated this week
- [ICASSP2025] Official code for VoiceDiT: Dual-Condition Diffusion Transformer for Environment-Aware Speech Synthesis☆40Updated 8 months ago
- Wideband beamforming: code reproducing Chapter 9 of Yan Shefeng's (鄢社峰) book on optimized beamforming☆26Updated 2 years ago