clement-pages / gryannote
Provides custom Gradio components that make diarization-based audio labeling easier and faster.
☆34Updated this week
Related projects:
- ☆61Updated last month
- Joint speech-language model - respond directly to audio!☆29Updated 4 months ago
- This is the official repository of ISMIR 2024 paper "Emotion-driven Piano Music Generation via Two-stage Disentanglement and Functional R…☆35Updated last month
- Trying to build an all in one speech-text language model - a bit like GPT-4o☆18Updated 3 months ago
- Audio tokenization, in the fastest way possible!☆43Updated 3 weeks ago
- VoiceBox neural network implementation☆88Updated last month
- A list of language models with permissive licenses such as MIT or Apache 2.0☆21Updated 3 weeks ago
- Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor…☆51Updated 5 months ago
- Speaker Diarization with Transformers☆57Updated 3 months ago
- An espeak-compatible, permissively-licensed IPA phonemizer (G2P) based on DeepPhonemizer. Usable as a drop-in replacement for espeak's GP…☆74Updated 2 months ago
- Video+code lecture on building nanoGPT from scratch☆64Updated 3 months ago
- Tokun to can tokens☆13Updated this week
- Cog wrapper for collabora/WhisperSpeech☆23Updated 6 months ago
- Sing an idea ➡️ AI music sample🔥🎶☆87Updated 4 months ago
- Create an LJSpeech-structured voice dataset from wave input☆16Updated 2 months ago
- Supervoice diffusion enhance☆24Updated 2 months ago
- Fast approximate inference on a single GPU with sparsity aware offloading☆39Updated 8 months ago
- BUD-E (Buddy) is an open-source voice assistant framework that facilitates seamless interaction with AI models and APIs, enabling the cre…☆24Updated 2 months ago
- High level library for batched embeddings generation, blazingly-fast web-based RAG and quantized indexes processing ⚡☆58Updated 2 weeks ago
- Llama3.1 learns to Listen☆134Updated this week
- The demo page of UniAudio☆34Updated 7 months ago
- Experiments with BitNet inference on CPU☆46Updated 5 months ago
- A collection of notebooks for the Hugging Face blog series (https://huggingface.co/blog).☆41Updated last month
- The official implementation of our paper "Instruct-MusicGen: Unlocking Text-to-Music Editing for Music Language Models via Instruction Tu…☆65Updated 2 weeks ago
- Efficient approach to speaker diarization using voice characteristics extraction☆56Updated 4 months ago
- Examples of apps built with Nendo, the AI Audio Tool Suite☆55Updated 6 months ago
- VALL-E 2 reproduction☆72Updated 2 months ago
- Demo python script app to interact with llama.cpp server using whisper API, microphone and webcam devices.☆44Updated 10 months ago
- ☆48Updated last month
- Apps that run on modal.com☆12Updated 3 months ago