mhussam-ai / StimulerVoiceX
StimulerVoiceX is a denoising and speech enhancement system. It uses deep learning techniques to remove noise from speech signals and improve their quality and clarity. It can handle various types of noise, such as white noise, babble noise, or environmental noise. It can also enhance speech features, such as volume, pitch, or timbre.
☆11Updated last year
Alternatives and similar repositories for StimulerVoiceX
Users who are interested in StimulerVoiceX are comparing it to the libraries listed below
- Speech enhancement in noisy and reverberant environments using deep neural networks☆20Updated last month
- Implementation of SoundStream from the paper: "SoundStream: An End-to-End Neural Audio Codec"☆12Updated 3 months ago
- Spectral Mapping of Singing Voices: U-Net-Assisted Vocal Segmentation☆12Updated 5 months ago
- Cantonese Grapheme-to-Phoneme Converter based on GitYCC/g2pW☆13Updated 5 months ago
- speaker-disentangled speech linguistic content quantizer☆14Updated last month
- ☆24Updated last week
- Accompanying repository for the paper "DiffVox: A Differentiable Model for Capturing and Analysing Professional Effects Distributions"☆14Updated 3 weeks ago
- This repository contains the source code for the implementation of two deep learning models concerning the audio super resolution task.☆14Updated 2 years ago
- Analysis of XLS-R for Speech Quality Assessment☆13Updated 3 months ago
- Uses machine learning to denoise audio containing speech☆33Updated 10 months ago
- Generative voice cloning model using TTS synthesis with state-of-the-art Zero-Shot Multi-Speaker functionality. A web API built with the…☆47Updated 2 years ago
- ☆10Updated 6 months ago
- ☆22Updated 3 years ago
- 'Grad-TTS' with Multilingual Cleaners☆10Updated last year
- Tr-VAD: An Efficient Transformer based Voice Activity Detection Model☆11Updated 9 months ago
- ☆23Updated last year
- [NCMMSC'2024] Emotion-Aware Prosodic Phrasing for Expressive Text-to-Speech☆22Updated 8 months ago
- Finally, some decent sample sentences☆22Updated last year
- High-performance ASR tool using Faster Whisper, supporting custom models, multi-language transcription, and real-time processing feedback…☆10Updated 6 months ago
- Voice activity detection and speaker gender segmentation audiovisual corpus☆13Updated 3 months ago
- Implementation of Emo-StarGAN☆45Updated last year
- Official PyTorch implementation of TTS Style Transfer☆23Updated 2 years ago
- Unofficial implementation of ResGrad: Residual Denoising Diffusion Probabilistic Models for Text to Speech☆16Updated 3 months ago
- This is the official implementation of our multi-channel multi-speaker multi-spatial neural audio codec architecture.☆49Updated 2 months ago
- Enhanced Reverberation As Supervision (ERAS) for unsupervised reverberant speech separation☆12Updated 9 months ago
- Supervoice Speaker Separation Network☆12Updated 11 months ago
- ☆13Updated 8 months ago
- Codebase and project page for EDMSound☆34Updated last year
- ☆8Updated 8 months ago
- TriNet: stabilizing self-supervised learning from complete or slow collapse on ASR.☆26Updated last year