kaiidams / soundstream-pytorch
Unofficial PyTorch implementation of SoundStream, with training code and a 16 kHz pretrained checkpoint
☆76 · Updated this week
Alternatives and similar repositories for soundstream-pytorch
Users who are interested in soundstream-pytorch are comparing it to the libraries listed below.
- My hybrid TTS network that combines VALL-E, VoiceBox, SpeechFlow, Seamless and TortoiseTTS into one ☆26 · Aug 5, 2024 · Updated last year
- GPT-style network for phonemization with durations of text ☆68 · Mar 21, 2024 · Updated last year
- Unofficial implementation of ConvNeXt-TTS powered by lightning ☆18 · Oct 20, 2024 · Updated last year
- Streaming Vocos ☆29 · Jun 10, 2025 · Updated 8 months ago
- This repository is an implementation of this article: https://arxiv.org/pdf/2107.03312.pdf ☆417 · Apr 21, 2022 · Updated 3 years ago
- ☆25 · Aug 2, 2024 · Updated last year
- Clean and modernized implementation of FastSpeech2/LightSpeech using IPA ☆18 · Aug 16, 2024 · Updated last year
- ☆19 · Mar 22, 2024 · Updated last year
- ☆52 · Jul 16, 2025 · Updated 6 months ago
- [ICASSP 2024] Official code for FreGrad ☆35 · May 13, 2024 · Updated last year
- ☆13 · Sep 12, 2024 · Updated last year
- ☆11 · Nov 7, 2024 · Updated last year
- All generative models in one for a better TTS model ☆74 · Sep 8, 2024 · Updated last year
- Train no-reference speech quality estimators with multiple datasets via learned, per-dataset alignments. ☆18 · Aug 1, 2025 · Updated 6 months ago
- ☆25 · Mar 6, 2024 · Updated last year
- DiFlow-TTS delivers low-latency zero-shot TTS via discrete flow matching and factorized speech tokens. A compact, open framework for fast… ☆51 · Updated this week
- Unified Speech Language Model for paper "SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models" (ICLR 2024) ☆152 · Sep 14, 2023 · Updated 2 years ago
- E2E TTS using Conditional Flow Matching (Experimental*) ☆71 · Nov 10, 2023 · Updated 2 years ago
- Official Implementation of TSELM: Target speaker extraction using discrete tokens and language models ☆55 · Apr 14, 2025 · Updated 10 months ago
- An automatic prosodic boundary annotation tool for Text-to-Speech Synthesis (TTS). ☆51 · Jun 11, 2024 · Updated last year
- Unofficial implementation of High Fidelity Neural Audio Compression ☆173 · Aug 15, 2024 · Updated last year
- Official implementation of the paper "BigCodec: Pushing the Limits of Low-Bitrate Neural Speech Codec" ☆212 · Sep 19, 2024 · Updated last year
- Ultra-low-bitrate Speech Codec for Speech Language Modeling Applications ☆86 · Dec 20, 2024 · Updated last year
- Spherical residual vector quantization (SRVQ) ☆31 · Aug 25, 2024 · Updated last year
- Dataset/code for AudioMarkBench: Benchmarking Robustness of Audio Watermarking ☆45 · Aug 23, 2024 · Updated last year
- An unofficial PyTorch implementation of "STREAMVC: REAL-TIME LOW-LATENCY VOICE CONVERSION". ☆81 · Apr 15, 2025 · Updated 9 months ago
- FACodec: Speech Codec with Attribute Factorization used for NaturalSpeech 3 ☆235 · Apr 20, 2024 · Updated last year
- SpeechPlus: Small LLM-Based Text-to-Speech Library 🚀 ☆20 · May 20, 2025 · Updated 8 months ago
- Forced alignment decoder for Whisper. ☆14 · Mar 13, 2024 · Updated last year
- FNSE-SBGAN: Far-field Speech Enhancement with Schrödinger Bridge and Generative Adversarial Networks ☆17 · May 12, 2025 · Updated 9 months ago
- ☆49 · Apr 1, 2025 · Updated 10 months ago
- The implementation of the paper "SpeechTripleNet: End-to-End Disentangled Speech Representation Learning for Content, Timbre and Prosody" ☆34 · Nov 23, 2023 · Updated 2 years ago
- This is the official train-dev-test release of the Interspeech2024 Discrete Speech Representation Challenge. ☆32 · Jan 26, 2024 · Updated 2 years ago
- Reimplementation of Miipher ☆29 · Aug 16, 2023 · Updated 2 years ago
- Supervoice diffusion enhance ☆28 · Jul 15, 2024 · Updated last year
- ☆27 · Sep 5, 2024 · Updated last year
- An ODE-based generative neural vocoder using Rectified Flow ☆58 · Apr 29, 2023 · Updated 2 years ago
- ☆43 · Jan 13, 2025 · Updated last year
- ☆86 · May 21, 2023 · Updated 2 years ago