SELMA-project / ml4audio
audio, NLP, ML with huggingface, nvidia/nemo, speechbrain
☆11 Updated 2 years ago
Alternatives and similar repositories for ml4audio
Users who are interested in ml4audio are comparing it to the libraries listed below.
- An open source NLP as a service project focused on providing state of the art systems with ease. Training and inference by simple docker … ☆20 Updated last year
- Identifying individual speakers in an audio stream based on the unique characteristics found in individual voices using Python ☆18 Updated 2 years ago
- A testing repo to share code and thoughts on diarisation ☆56 Updated last year
- Provide Gradio custom components to make the diarization-based audio labeling process easier and faster. ☆68 Updated 2 weeks ago
- A Python library to find differences between audio and transcriptions ☆19 Updated last year
- ☆19 Updated 7 months ago
- Accelerate Whisper tasks such as transcription by multiprocessing through parallelization ☆25 Updated 3 years ago
- The YouTube Text-To-Speech dataset is comprised of waveform audio extracted from YouTube videos alongside their English transcriptions ☆52 Updated 4 years ago
- A lightweight Python library for running TTS models with a unified API. ☆21 Updated 8 months ago
- A PyTorch demo of the paper Voice Separation with an Unknown Number of Multiple Speakers using Gradio and NVIDIA NeMo ASR model. ☆36 Updated last year
- Code for OpenAI Whisper Web App Demo ☆93 Updated 3 years ago
- Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching" ☆66 Updated 11 months ago
- Speaker diarization service ☆24 Updated 4 months ago
- Coqui STT Model Manager - install, manage and try out Coqui STT models from the Model Zoo ☆25 Updated 2 years ago
- A repo with scripts to test and play around with Facebook's recent llama models! 🤗 ☆28 Updated 2 years ago
- Coqui AI TTS plugin ☆87 Updated 4 months ago
- Open TTS models, built for streaming on the edge ☆43 Updated 7 months ago
- Go from raw audio files to a text-audio dataset automatically with OpenAI's Whisper. ☆137 Updated 2 years ago
- Create an LJSpeech structured voice dataset on wave input ☆36 Updated last year
- 🌼 Daisy-TTS: Simulating Wider Spectrum of Emotions via Prosody Embedding Decomposition ☆15 Updated last year
- Efficient approach to speaker diarization using voice characteristics extraction ☆104 Updated 4 months ago
- Audio tokenization, in the fastest way possible! ☆53 Updated last year
- Wake word detection with custom phrases without model training ☆19 Updated 2 months ago
- Text to speech is an emerging zone of AI. This repository helps to create a dataset with audio and transcripts for personalized text to s… ☆28 Updated 2 years ago
- ☆157 Updated 2 years ago
- Multivoice: Enhance your foreign-language movie and TV show experience with personalized dubbed versions. Our project uses voice cloning … ☆26 Updated 2 years ago
- Audio processing using deep neural networks. Speaker identification using voice embeddings. ☆13 Updated 2 years ago
- Towards Robust Blind Face Restoration with Codebook Lookup Transformer ☆33 Updated last year
- Real-Time Whisper Voice Recognition with Vosk model feedback. ☆119 Updated 2 years ago
- whisper.cpp bindings for Python ☆107 Updated 2 years ago