sangramsingnk / Speech
Text-to-Speech Recipe. Text-to-speech (TTS), also referred to as speech synthesis, lets users generate speech signals from input text. SpeechBrain supports popular TTS models (e.g., Tacotron 2) and vocoders (e.g., HiFi-GAN).
☆19Updated 5 months ago
Alternatives and similar repositories for Speech
Users who are interested in Speech are comparing it to the libraries listed below.
- Community framework for training tortoise☆41Updated 2 years ago
- A PyTorch demo of the paper "Voice Separation with an Unknown Number of Multiple Speakers" using Gradio and the NVIDIA NeMo ASR model.☆36Updated last year
- PyTorch Implementation of Non-autoregressive Expressive (emotional, conversational) TTS based on FastSpeech2, supporting English, Korean,…☆301Updated 3 years ago
- Desktop application for neural speech synthesis written in C++☆215Updated 2 years ago
- Efficient approach to speaker diarization using voice characteristics extraction☆94Updated last year
- 🐍 🤖 Pip installable package for StyleTTS 2 human-level text-to-speech and voice cloning☆159Updated 10 months ago
- One Shot Voice Cloning based on Unet-TTS☆242Updated 3 years ago
- An improved version of Real-time-voice-cloning☆50Updated last year
- General Speech Restoration☆277Updated last year
- PyTorch Implementation of FastSpeech 2: Fast and High-Quality End-to-End Text to Speech☆230Updated 2 years ago
- Simplified diarization pipeline using some pretrained models - audio file to diarized segments in a few lines of code☆149Updated last year
- A deep neural network architecture for low-latency audio processing☆301Updated last year
- Source code for the paper titled "Speech Denoising without Clean Training Data: a Noise2Noise Approach". Paper accepted at the INTERSPEE…☆188Updated last year
- Scripts for computing the Intelligibility and CLVP scores for evaluating TTS models☆154Updated last year
- VoiceSplit: Targeted Voice Separation by Speaker-Conditioned Spectrogram☆249Updated 9 months ago
- Tacotron 2 - PyTorch implementation with faster-than-realtime inference modified to enable cross lingual voice cloning.☆360Updated 2 years ago
- ☆130Updated 2 years ago
- Live speech recognition using Facebook's wav2vec 2.0 model.☆353Updated last year
- open-source audio datasets☆150Updated last year
- Misc. tools/scripts that I made to use for tortoise☆21Updated 9 months ago
- Speech recognition & diarisation solution with text alignment, deployed in AML pipelines☆94Updated last year
- [WIP] VoiceSmith makes training text to speech models easy.☆224Updated 2 years ago
- PyTorch code implementation of EfficientSpeech - to be presented at ICASSP2023.☆169Updated last year
- Fine-tune and evaluate Whisper models for Automatic Speech Recognition (ASR) on custom datasets or datasets from huggingface.☆306Updated last year
- Unofficial implementation of HiFi-GAN+ from the paper "Bandwidth Extension is All You Need" by Su, et al.☆214Updated last year
- A Non-Autoregressive Transformer based Text-to-Speech, supporting a family of SOTA transformers with supervised and unsupervised duration…☆325Updated 2 years ago
- This repository is an implementation of Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis (SV2TTS) wit…☆168Updated 4 years ago
- TalkNet 2: Non-Autoregressive Depth-Wise Separable Convolutional Model for Speech Synthesis with Explicit Pitch and Duration Prediction.☆88Updated 3 years ago
- Noise removal/reduction for audio files in Python. De-noising is done using wavelets and thresholding is done by VISU Shrink threshold…☆192Updated 2 years ago
- ☆256Updated last year