SELMA-project / ml4audio
audio, NLP, ML with huggingface, nvidia/nemo, speechbrain
☆11 Updated last year
Alternatives and similar repositories for ml4audio
Users who are interested in ml4audio are comparing it to the libraries listed below.
- Audio processing using deep neural networks. Speaker identification using voice embeddings. ☆13 Updated 2 years ago
- Easily turn large sets of audio urls to an audio dataset. ☆21 Updated 2 years ago
- Identifying individual speakers in an audio stream based on the unique characteristics found in individual voices using Python ☆18 Updated last year
- Self-supervised neural network for music recommendations. ☆18 Updated last year
- A lightweight Python library for running TTS models with a unified API. ☆18 Updated 3 months ago
- An open source NLP as a service project focused on providing state of the art systems with ease. Training and inference by simple docker … ☆20 Updated 8 months ago
- Prabhupadavani: A Code-mixed Speech Translation Data for 25 languages ☆13 Updated 2 years ago
- Rust bindings for CTranslate2 ☆14 Updated last year
- Minimal, clean code for video/image "patchnization" - a process commonly used in tokenizing visual data for use in a Transformer encoder.… ☆11 Updated last year
- GreenLIT: Using GPT-J with Multi-Task Learning to Create New Screenplays ☆17 Updated 2 years ago
- A 🔥 cookiecutter template for building Hugging Face Spaces ☆11 Updated 3 years ago
- Repository for fine-tuning Transformers 🤗 based seq2seq speech models in JAX/Flax. ☆36 Updated 2 years ago
- Audio tokenization, in the fastest way possible! ☆52 Updated 9 months ago
- Experiments and tutorials with and for torchaudio ☆13 Updated 4 years ago
- Lyra V2 (SoundStream) running in the browser ☆18 Updated last year
- Code for OpenAI Whisper Web App Demo ☆93 Updated 2 years ago
- Generate audio datasets for training Text-To-Speech models, through smart audio splitting with silence detection, and transcription using… ☆28 Updated 2 years ago
- Cog wrapper for collabora/WhisperSpeech ☆24 Updated last year
- Accompanying repository for the paper "DiffVox: A Differentiable Model for Capturing and Analysing Professional Effects Distributions" ☆24 Updated 3 weeks ago
- A Python library to find differences between audio and transcriptions ☆20 Updated last year
- Implementation of 'Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis', in MLX ☆19 Updated 7 months ago
- Python text-to-speech library with built-in voice effects and support for multiple TTS engines ☆23 Updated 2 months ago
- Implementation of "Audio xLSTMs: Learning Self-supervised audio representations with xLSTMs" in PyTorch ☆18 Updated last week
- A project about learning how to synchronize subtitles in movies using machine learning. ☆9 Updated 2 years ago
- I have created a dataset of Image-Text-Pairs by using the cosine similarity of the CLIP embeddings of the image & its caption derived f… ☆15 Updated 4 years ago
- This repository contains the implementation of the paper: "Span Classification with Structured Information for Disfluency Detection in Sp… ☆13 Updated last year
- ForceAlign is a Python library for forced alignment of English text to English audio. You can use ForceAlign to get word or phoneme level… ☆15 Updated 6 months ago