shahules786 / mayavoz
PyTorch-based speech enhancement toolkit.
★337 · Updated last year
Alternatives and similar repositories for mayavoz
Users who are interested in mayavoz are comparing it to the libraries listed below.
- Go from raw audio files to a text-audio dataset automatically with OpenAI's Whisper. ★137 · Updated last year
- 🤗 Nix-TTS: Lightweight and End-to-end Text-to-Speech via Module-wise Distillation ★254 · Updated last year
- Performant and accurate speech recognition built on PyTorch ★253 · Updated 3 years ago
- Code and Pretrained Models for Interspeech 2023 Paper "Whisper-AT: Noise-Robust Automatic Speech Recognizers are Also Strong Audio Event … ★396 · Updated last year
- This is the PyTorch implementation of the Universal Source Separation with Weakly labelled Data. ★358 · Updated last year
- Simplified diarization pipeline using some pretrained models - audio file to diarized segments in a few lines of code ★149 · Updated last year
- A tokenizer, text cleaner, and phonemizer for many human languages. ★320 · Updated 8 months ago
- General Speech Restoration ★280 · Updated last year
- On-device voice activity detection (VAD) powered by deep learning ★220 · Updated last week
- [WIP] VoiceSmith makes training text to speech models easy. ★225 · Updated 2 years ago
- On-device speech-to-text engine powered by deep learning ★457 · Updated this week
- Desktop application for neural speech synthesis written in C++ ★215 · Updated 2 years ago
- Improving transcription performance of OpenAI Whisper for CPU based deployment ★246 · Updated 2 years ago
- Minimal extension of OpenAI's Whisper adding speaker diarization with special tokens ★503 · Updated last year
- Faster Tortoise inference than Tortoise Fast Fork ★128 · Updated last year
- Live speech recognition using Facebook's wav2vec 2.0 model. ★361 · Updated last year
- VoiceSplit: Targeted Voice Separation by Speaker-Conditioned Spectrogram ★251 · Updated 11 months ago
- A fully working PyTorch implementation of NaturalSpeech (Tan et al., 2022) ★474 · Updated last year
- NU-Wave 2: A General Neural Audio Upsampling Model for Various Sampling Rates @ INTERSPEECH 2022 ★294 · Updated last year
- Mimic Recording Studio is a Docker-based application you can install to record voice samples, which can then be trained into a TTS voice … ★511 · Updated 2 years ago
- This is the audio sample repository for speech separation model "MossFormer2". ★135 · Updated 7 months ago
- DLAS - A configuration-driven trainer for generative models ★140 · Updated 2 years ago
- A vocal pitch correction web application (like Autotune) ★315 · Updated 2 years ago
- Conformer-based Metric GAN for speech enhancement ★371 · Updated last year
- Working online speech recognition based on RNN Transducer. (Trained model release available in release) ★293 · Updated 3 years ago
- Generate granular word-level captions in SRT format ★57 · Updated 2 years ago
- PyTorch Implementation of PortaSpeech: Portable and High-Quality Generative Text-to-Speech ★338 · Updated 3 years ago
- State-of-the-art (ranked #1 Aug 2022) German Speech Recognition in 284 lines of C++. This is a 100% private 100% offline 100% free CLI to… ★412 · Updated 2 years ago
- Experimental code: sound file preprocessing to optimize Whisper transcriptions without hallucinated texts ★332 · Updated 8 months ago
- A novel human-interaction method for real-time speech extraction on headphones. ★572 · Updated last year