JiwanSeo / RAQ-VAE
Rate-Adaptive Quantization: A Multi-Rate Codebook Adaptation for Vector Quantization-based Generative Models
☆15 · Sep 10, 2025 · Updated 5 months ago
Alternatives and similar repositories for RAQ-VAE
Users who are interested in RAQ-VAE are comparing it to the libraries listed below.
- Interface Design for Self-Supervised Speech Models, Accepted to Interspeech2024 ☆16 · Nov 19, 2024 · Updated last year
- ESLTTS dataset ☆16 · Feb 6, 2025 · Updated last year
- Code for Federated Learning for Semantic Communication Edge Networks in Industrial IoT ☆16 · Jul 2, 2023 · Updated 2 years ago
- This repo contains the official PyTorch implementation of "Analyzing Discrete Self Supervised Speech Representation For Spoken Language M… ☆20 · Jan 3, 2023 · Updated 3 years ago
- [NeurIPS 2024] Image Understanding Makes for A Good Tokenizer for Image Generation ☆22 · Dec 17, 2024 · Updated last year
- The official deployment of MambaJSCC in pytorch ☆27 · Sep 10, 2025 · Updated 5 months ago
- The accompanying code for "Exploring the limits of decoder-only models trained on public speech recognition corpora" (Ankit Gupta, George… ☆20 · Oct 11, 2024 · Updated last year
- ☆18 · Aug 24, 2024 · Updated last year
- An AR+AR TTS attempt. ☆18 · Jan 13, 2025 · Updated last year
- Generate audio datasets for training Text-To-Speech models, through smart audio splitting with silence detection, and transcription using… ☆30 · May 27, 2023 · Updated 2 years ago
- ☆44 · Sep 19, 2024 · Updated last year
- Voice conversion with just linear regression. ☆33 · Sep 25, 2025 · Updated 4 months ago
- ☆19 · Mar 22, 2024 · Updated last year
- ☆52 · Jun 24, 2025 · Updated 7 months ago
- Trying to build an all in one speech-text language model - a bit like GPT-4o ☆22 · Jun 1, 2024 · Updated last year
- ☆48 · Apr 18, 2023 · Updated 2 years ago
- A pitch detection model trained to be robust against noise and reverberation environments. ☆27 · Jan 21, 2025 · Updated last year
- Pytorch code for IEEE TWC paper "Predictive and Adaptive Deep Coding for Wireless Image Transmission in Semantic Communication" ☆176 · Jun 10, 2024 · Updated last year
- ☆68 · Dec 2, 2025 · Updated 2 months ago
- ☆54 · Jul 16, 2025 · Updated 7 months ago
- Extreme Image Compression using Fine-tuned VQGAN Models (DCC 2024) ☆23 · Jan 14, 2025 · Updated last year
- Source code for research papers about the semantic communication approach SINFONY ☆23 · Feb 4, 2026 · Updated last week
- Code for Principal Masked Autoencoders ☆30 · Feb 4, 2026 · Updated last week
- List of Podcast Feeds using iTunes API and script to download 6,000,000~ hours of English speech. ☆31 · Apr 13, 2023 · Updated 2 years ago
- Official Pytorch Implementation of Our CVPR2023 Paper: "Not All Image Regions Matter: Masked Vector Quantization for Autoregressive Image… ☆63 · Jul 21, 2023 · Updated 2 years ago
- Semantic Communication Systems with Pre-Trained Language Model ☆25 · Oct 28, 2023 · Updated 2 years ago
- Accent Classification in Speech ☆25 · Jul 24, 2019 · Updated 6 years ago
- Official implementation of "Unsupervised Pre-training for Data-Efficient Text-to-Speech on Low Resource Languages", ICASSP 2023 ☆27 · Apr 27, 2023 · Updated 2 years ago
- StyleTTS 2 Optimized Training Fork ☆33 · Feb 2, 2025 · Updated last year
- A high-resolution implementation of HiFi-GAN Vocoder for Voice Conversion. ☆32 · Apr 10, 2023 · Updated 2 years ago
- ☆29 · Dec 14, 2022 · Updated 3 years ago
- ☆56 · Mar 20, 2023 · Updated 2 years ago
- Official repository for NAST: Noise Aware Speech Tokenization for Speech Language Models (Interspeech 2024) https://arxiv.org/abs/2406.11… ☆46 · Jul 2, 2024 · Updated last year
- Viterbi decoding in PyTorch ☆40 · Sep 10, 2025 · Updated 5 months ago
- My vocoder experiments ☆31 · Jul 26, 2025 · Updated 6 months ago
- A minimal Pytorch Implementation of Stochastically Quantized Variational AutoEncoder (SQ-VAE) by Sony ☆33 · Oct 16, 2023 · Updated 2 years ago
- ☆68 · Jul 29, 2023 · Updated 2 years ago
- ☆71 · Jul 13, 2023 · Updated 2 years ago
- This repository contains the code and instructions needed to reproduce the dataset splits for our paper "Speech Translation for Code-Swit… ☆29 · Apr 8, 2022 · Updated 3 years ago