thomas-xin / Encodec-Stream
A lightweight wrapper around https://github.com/facebookresearch/encodec that adds dynamic streamed reading, seeking, metadata, and GPU support.
☆11Updated 6 months ago
Related projects
Alternatives and complementary repositories for Encodec-Stream
- A high-resolution implementation of HiFi-GAN Vocoder for Voice Conversion.☆30Updated last year
- This is the official implementation of our multi-channel multi-speaker multi-spatial neural audio codec architecture.☆42Updated last month
- Uses VITS and Opencpop to develop singing voice synthesis; different from VISinger.☆32Updated last year
- A toolkit to calculate speech audio quality. Not affiliated with the original authors☆38Updated 2 months ago
- Viterbi decoding in PyTorch☆26Updated last month
- Transcribing Speech with Multinomial Diffusion, training code and models.☆75Updated last year
- SelfRemaster: SSL Speech Restoration☆84Updated 10 months ago
- An espeak-compatible, permissively-licensed IPA phonemizer (G2P) based on DeepPhonemizer. Usable as a drop-in replacement for espeak's GP…☆83Updated last month
- Torch implementation of Whisper-guided DDPM based Voice Conversion☆49Updated last year
- SpeechDenoiser: Real-Time Speech Denoising with ONNX. Welcome to SpeechDenoiser, a simple and effective solution for real-time speech den…☆43Updated 2 months ago
- SDX23 startkit for the Demucs baselines.☆24Updated last year
- Codebase for the paper 'EncodecMAE: Leveraging neural codecs for universal audio representation learning'☆87Updated 3 months ago
- PyTorch implementation of the ICASSP-24 paper: "Improving Audio Captioning Models with Fine-grained Audio Features, Text Embedding Superv…☆30Updated 10 months ago
- [IJCAI'23] Learning to Speak from Text for Low-Resource TTS☆64Updated last year
- ConMamba for Automatic Speech Recognition☆44Updated 2 months ago
- AudioSR-Upsampling (any -> 48kHz)☆38Updated 8 months ago
- PyTorch implementation of WaveFit [2022, Google] which is one of SOTA lightweight/fast speech vocoders.☆45Updated 3 weeks ago
- Zero-Shot Emotion Style Transfer☆37Updated 7 months ago
- Collection of scripts from mHuBERT-147.☆22Updated 4 months ago
- Clustering-based methods for overlapping diarization☆68Updated 9 months ago
- My implementation of Vocos for comparison.☆12Updated last year
- A sequence-to-sequence voice conversion toolkit.☆85Updated 4 months ago
- GOMIN: Gaudio Open Mel-spectrogram Inversion Network☆109Updated 9 months ago
- Official implementation of DualCycleGAN for nonparallel audio super resolution☆50Updated 2 years ago