chomeyama / wavehax
Official repository of Wavehax vocoder
☆63 Updated last week
Alternatives and similar repositories for wavehax
Users that are interested in wavehax are comparing it to the libraries listed below
- Unofficial implementation of wavenext vocoder ☆53 Updated last year
- ☆51 Updated 6 months ago
- 🎙️ Automatically transcribe audio/video into high-quality, speaker-specific Text-To-Speech datasets ✨ ☆17 Updated 7 months ago
- ☆44 Updated last year
- [ICASSP 2025] "FLowHigh: Towards efficient and high-quality audio super-resolution with single-step flow matching" ☆94 Updated 11 months ago
- ☆48 Updated 5 months ago
- A lightweight audio codec based on a single quantizer ☆65 Updated 4 months ago
- MeanAudio: Fast and Faithful Text-to-Audio Generation with Mean Flows ☆116 Updated 3 months ago
- PyTorch implementation of Miipher-2 [2025] which is a speech restoration model by Google DeepMind ☆61 Updated 3 months ago
- Contains the code associated with the ICLR submission for our text-to-speech diffusion model ☆57 Updated 2 years ago
- An automatic prosodic boundary annotation tool for Text-to-Speech Synthesis (TTS). ☆51 Updated last year
- The open source code for SimpleSpeech series ☆143 Updated last year
- ☆19 Updated last year
- [EMNLP 2024] ESC: Efficient Speech Coding with Cross-Scale Residual Vector Quantized Transformers ☆124 Updated 9 months ago
- LAFMA: A Latent Flow Matching Model for Text-to-Audio Generation (INTERSPEECH 2024) ☆43 Updated last year
- Sequence alignment methods with helpers for PyTorch. ☆24 Updated 3 years ago
- This is the official implementation of our multi-channel multi-speaker multi-spatial neural audio codec architecture. ☆51 Updated 9 months ago
- Variable Bitrate Residual Vector Quantization for Audio Coding ☆50 Updated 7 months ago
- ☆70 Updated last year
- A single-layer, streaming codec model providing SOTA audio quality and discrete tokens designed for superior downstream modelability. ☆111 Updated 6 months ago
- Voice conversion with just linear regression. ☆32 Updated 3 months ago
- ☆81 Updated 5 months ago
- Official code for "Semantic-VAE: Semantic-Alignment Latent Representation for Better Speech Synthesis" ☆102 Updated last week
- SoloAudio: Target Sound Extraction with Language-oriented Audio Diffusion Transformer. ☆113 Updated last year
- Elucidated Text-To-Audio (ETTA) is a SOTA text-to-audio model with a holistic understanding of the design space and trained with syntheti… ☆93 Updated 2 months ago
- Training code and trained checkpoints for ASGAN. ☆62 Updated 2 years ago
- ☆61 Updated last year
- My vocoder experiments ☆31 Updated 5 months ago
- ☆103 Updated 2 months ago
- Implementation of Acoustic BPE (Shen et al., 2024), extended for RVQ-based Neural Audio Codecs ☆75 Updated 3 weeks ago