Emrys365 / torch_stft
PyTorch-based implementations of short-time Fourier transform
☆15Updated 2 years ago
Alternatives and similar repositories for torch_stft
Users who are interested in torch_stft are comparing it to the libraries listed below
- Unofficial implementation of ConvNeXt-TTS powered by lightning☆17Updated 6 months ago
- Vocoder-Free Non-Parallel Conversion of Whispered Speech With Masked Cycle-Consistent Generative Adversarial Networks☆17Updated last year
- My hybrid TTS network that combines VALL-E, VoiceBox, SpeechFlow, Seamless, and TortoiseTTS into one☆27Updated 9 months ago
- ☆10Updated 6 months ago
- ☆16Updated 3 years ago
- The demo for "Discretization and Re-synthesis: an alternative method to solve the Cocktail Party Problem".☆12Updated 3 years ago
- GPT for FACodec☆13Updated last year
- A toolkit for researchers in multimodal sound separation.☆16Updated last year
- An STFT/iSTFT written in PyTorch using 1D convolutions☆28Updated 10 months ago
- video cut powered by AI☆25Updated 2 years ago
- Phonemes and durations labeling based on whisper small☆11Updated 10 months ago
- Spatial Voice Conversion: Voice Conversion Preserving Spatial Information and Non-target Signals☆17Updated 9 months ago
- ESLTTS dataset☆16Updated 3 months ago
- A robust pitch tracker using synchro-squeezed fft and frequency domain autocorrelation☆33Updated last year
- An evaluation set for large-scale trained TTS models (Coming in Sep 2024)☆12Updated 8 months ago
- End-to-end speech synthesis system with FastSpeech2 & HiFi-GAN☆13Updated 2 years ago
- Spectral Mapping of Singing Voices: U-Net-Assisted Vocal Segmentation☆12Updated 5 months ago
- NU-Wave 2: A General Neural Audio Upsampling Model for Various Sampling Rates [WIP]☆24Updated 2 years ago
- Enhanced Reverberation As Supervision (ERAS) for unsupervised reverberant speech separation☆12Updated 9 months ago
- NISQA - Non-Intrusive Speech Quality and TTS Naturalness Assessment☆16Updated 3 years ago
- Production-ready vocoder using BigVSAN☆11Updated last year
- speaker-disentangled speech linguistic content quantizer☆14Updated last month
- Conformer block with Rotary Position Embedding, modified from lucidrains' implementation☆12Updated 8 months ago
- Official source code of the INTERSPEECH 2023 paper: "Audio-Visual Speech Separation in Noisy Environments with a Lightweight Iterative Mo…☆19Updated last year
- Unofficial PyTorch implementation of HiFi-GAN with fast MISR.☆15Updated 2 years ago
- Speech enhancement in noisy and reverberant environments using deep neural networks☆20Updated last month
- Tidy Tunes is an easy-to-use pipeline for mining high-quality audio data for speech generation models. To do so, it chains multiple open …☆16Updated last month
- Ultrafast GAN based Vocoder for Text to Speech☆50Updated 2 years ago
- ☆20Updated 7 months ago
- Spherical residual vector quantization (SRVQ)☆28Updated 8 months ago