SELMA-project / ml4audio
audio, NLP, ML with huggingface, nvidia/nemo, speechbrain
☆11 Updated 2 years ago
Alternatives and similar repositories for ml4audio
Users who are interested in ml4audio are comparing it to the libraries listed below.
- A testing repo to share code and thoughts on diarisation ☆57 Updated last year
- Identifying individual speakers in an audio stream based on the unique characteristics found in individual voices using Python ☆18 Updated 2 years ago
- Go from raw audio files to a text-audio dataset automatically with OpenAI's Whisper. ☆137 Updated 2 years ago
- Provide Gradio custom components to make the diarization-based audio labeling process easier and faster. ☆69 Updated last month
- This public GitHub repository contains code for a fully self-hosted, on-premise transcription solution. ☆53 Updated last year
- ☆158 Updated 2 years ago
- This is the combined forks of two repos to enable OpenAI Whisper large image with VAD for low VRAM GPUs. ☆33 Updated 2 years ago
- openvino version of openai/whisper ☆178 Updated 2 years ago
- Examples of apps built with Nendo, the AI Audio Tool Suite ☆55 Updated last year
- Real-Time Whisper Voice Recognition with vosk model feedback. ☆121 Updated 2 years ago
- Code for OpenAI Whisper Web App Demo ☆93 Updated 3 years ago
- Speech recognition & diarisation solution with text alignment, deployed in AML pipelines ☆99 Updated last year
- OCTRA is a web-application for the orthographic transcription of audio files. ☆39 Updated this week
- 🐍 Coqui's machine learning job scheduler ☆31 Updated 4 years ago
- A repo with scripts to test and play around with Facebook's recent llama models! 🤗 ☆28 Updated 2 years ago
- A python library to find differences between audio and transcriptions ☆19 Updated 2 years ago
- Speaker diarization service ☆25 Updated 5 months ago
- faster-whisper livestream translation, OBS noise reduction, dual language subtitles ☆80 Updated 2 years ago
- Efficient approach to speaker diarization using voice characteristics extraction ☆105 Updated 5 months ago
- Coqui AI TTS plugin ☆85 Updated 5 months ago
- Speaker Diarization with Transformers ☆69 Updated 6 months ago
- Open TTS models, built for streaming on the edge ☆44 Updated 8 months ago
- A bidirectional recurrent neural network model with attention mechanism for restoring missing punctuation in unsegmented text ☆34 Updated 5 years ago
- Create an LJSpeech structured voice dataset on wave input ☆36 Updated last year
- A TextTiling-based algorithm for text segmentation (aka topic segmentation) that uses neural sentence encoders, as well as extractive sum… ☆51 Updated 2 years ago
- ☆19 Updated 9 months ago
- Audio processing using deep neural networks. Speaker identification using voice embeddings. ☆13 Updated 3 years ago
- On-device noise suppression powered by deep learning ☆77 Updated this week
- Coqui STT Model Manager - install, manage and try out Coqui STT models from the Model Zoo ☆25 Updated 2 years ago
- Accelerate Whisper tasks such as transcription by multiprocessing through parallelization ☆25 Updated 3 years ago