Sangramsingkayte / Speech
☆19 Updated last year
Related projects
Alternatives and complementary repositories for Speech
- Noise removal/reduction for audio files in Python. De-noising is done using wavelets, and thresholding is done by VisuShrink threshold… ☆173 Updated last year
- Go from raw audio files to a text-audio dataset automatically with OpenAI's Whisper. ☆133 Updated last year
- A PyTorch demo of the paper Voice Separation with an Unknown Number of Multiple Speakers using gradio and Nvidia NEMO ASR model. ☆33 Updated 10 months ago
- 🔊 Create labeled datasets, enhance audio quality, identify speakers, support diverse dataset types. 🎧👥📊 Advanced audio processing. ☆204 Updated 5 months ago
- Mirror of hf.co/pyannote/speaker-diarization-3.1 ☆15 Updated 10 months ago
- Live speech recognition using Facebook's wav2vec 2.0 model. ☆326 Updated 9 months ago
- Real-time noise cancellation for audio signals, such as removing construction noise by denoising the signal. ☆103 Updated 2 years ago
- [WIP] VoiceSmith makes training text to speech models easy. ☆222 Updated 2 years ago
- General Speech Restoration ☆276 Updated 10 months ago
- ☆35 Updated last year
- Whisper combined with Silero VAD, for improved long-form transcriptions ☆44 Updated last year
- PyTorch code implementation of EfficientSpeech - to be presented at ICASSP2023. ☆156 Updated 7 months ago
- Fine-tune and evaluate Whisper models for Automatic Speech Recognition (ASR) on custom datasets or datasets from huggingface. ☆251 Updated last year
- VoiceSplit: Targeted Voice Separation by Speaker-Conditioned Spectrogram ☆222 Updated 3 months ago
- A deep neural network architecture for low-latency audio processing ☆287 Updated last year
- An open source implementation of Microsoft's VALL-E X zero-shot TTS model. Demo is available at https://plachtaa.github.io ☆66 Updated last year
- VoiceBox neural network implementation ☆96 Updated 3 months ago
- Simplified diarization pipeline using some pretrained models - audio file to diarized segments in a few lines of code ☆140 Updated 6 months ago
- An implementation for training the HiFi-GAN part of the XTTSv2 model using Coqui/TTS. ☆59 Updated this week
- An improved version of Real-time-voice-cloning ☆45 Updated 8 months ago
- Official Implementation of StyleTTS-VC ☆164 Updated last year
- Official Implementation of StyleTTS ☆398 Updated 11 months ago
- Open models for Coqui STT ☆122 Updated last year
- 🌼 Daisy-TTS: Simulating Wider Spectrum of Emotions via Prosody Embedding Decomposition ☆15 Updated 8 months ago
- PyTorch Implementation of Non-autoregressive Expressive (emotional, conversational) TTS based on FastSpeech2, supporting English, Korean,… ☆282 Updated 3 years ago
- Text to speech alignment using CTC forced alignment ☆130 Updated 2 weeks ago
- Efficient approach to speaker diarization using voice characteristics extraction ☆67 Updated 6 months ago
- Community framework for training tortoise ☆38 Updated 2 years ago
- StyleTTS-ZS: Efficient High-Quality Zero-Shot Text-to-Speech Synthesis with Distilled Time-Varying Style Diffusion ☆158 Updated last month
- Text to speech is an emerging area of AI. This repository helps to create a dataset with audio and transcripts for personalized text to s… ☆27 Updated last year