SELMA-project / ml4audio
audio, NLP, ML with huggingface, nvidia/nemo, speechbrain
☆10Updated last year
Alternatives and similar repositories for ml4audio:
Users who are interested in ml4audio are comparing it to the libraries listed below.
- Self-supervised neural network for music recommendations.☆18Updated last year
- Lyra V2 (SoundStream) running in the browser☆19Updated last year
- An open source NLP as a service project focused on providing state of the art systems with ease. Training and inference by simple docker …☆20Updated 6 months ago
- Generate audio datasets for training Text-To-Speech models, through smart audio splitting with silence detection, and transcription using…☆28Updated last year
- Prabhupadavani: A Code-mixed Speech Translation Data for 25 languages☆13Updated 2 years ago
- A lightweight Python library for running TTS models with a unified API.☆17Updated last month
- Easily turn large sets of audio urls to an audio dataset.☆21Updated 2 years ago
- StyleTTS 2 Optimized Training Fork☆26Updated last month
- Minimal, clean code for video/image "patchnization" - a process commonly used in tokenizing visual data for use in a Transformer encoder.…☆11Updated 10 months ago
- Implementation of SoundStream from the paper: "SoundStream: An End-to-End Neural Audio Codec"☆12Updated last month
- Rust bindings for CTranslate2☆14Updated last year
- A Python library to find differences between audio and transcriptions☆16Updated last year
- Simple text to phonemes converter for multiple languages☆20Updated 2 years ago
- Experiments with generating GPT-2 fanfiction on specified topics.☆11Updated 5 years ago
- A 🔥 cookiecutter template for building Hugging Face Spaces☆11Updated 3 years ago
- Simple PyTorch Denoisers for Waveform Audio☆34Updated last month
- A neural network, written in TensorFlow, for filtering a target speaker's voice from audio☆21Updated 6 years ago
- Zero-Shot Foreign Accent Conversion without a Native Reference☆30Updated 10 months ago
- Audio processing using deep neural networks. Speaker identification using voice embeddings.☆13Updated 2 years ago
- Audio tokenization, in the fastest way possible!☆50Updated 6 months ago
- GPT for FACodec☆13Updated last year
- Implementation of "Audio xLSTMs: Learning Self-supervised audio representations with xLSTMs" in PyTorch☆18Updated last week
- This repository contains the implementation of the paper: "Span Classification with Structured Information for Disfluency Detection in Sp…☆12Updated last year
- Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor…☆19Updated last year
- Supervoice Speaker Separation Network☆12Updated 9 months ago
- 🎹 pyannote + 🗒 notebook = pyannotebook☆26Updated last year
- Contrastive Language-Audio Pretraining☆15Updated 3 years ago
- Audio Demo for "FastSVC: Fast Cross-Domain Singing Voice Conversion with Feature-wise Linear Modulation"☆20Updated 3 years ago