Rakam-AI / rakam_systems
☆11 Updated this week
Related projects
Alternatives and complementary repositories for rakam_systems
- Implementation of 'Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis', in MLX ☆13 Updated 3 weeks ago
- Final training script from HuggingFace Whisper Fine tuning event - to get best results on finetuned model. ☆12 Updated last year
- Implementation of "Audio xLSTMs: Learning Self-supervised audio representations with xLSTMs" in PyTorch ☆16 Updated last week
- Repository for fine-tuning Transformers 🤗 based seq2seq speech models in JAX/Flax. ☆34 Updated last year
- AI town https://github.com/a16z-infra/ai-town Patches to run on Hugging Face Spaces ☆19 Updated 5 months ago
- Provide Gradio custom components to make the diarization-based audio labeling process easier and faster. ☆45 Updated 2 weeks ago
- ☆84 Updated 7 months ago
- This app is intended to automatically create a corpus for ASR systems using pseudo-labeling. ☆27 Updated 9 months ago
- Audio tokenization, in the fastest way possible! ☆45 Updated 2 months ago
- Collection of scripts from mHuBERT-147. ☆22 Updated this week
- SpeechGLUE is a speech version of the GLUE benchmark, driven by text-to-speech. ☆13 Updated last year
- babyLM WhisBERT code ☆17 Updated 5 months ago
- ☆19 Updated last year
- ☆61 Updated 3 months ago
- GPT for FACodec ☆13 Updated 7 months ago
- ☆54 Updated this week
- Supervoice diffusion enhance ☆24 Updated 4 months ago
- ☆23 Updated last year
- Trying to build an all in one speech-text language model - a bit like GPT-4o ☆22 Updated 5 months ago
- A TTS model that makes a speaker speak new languages ☆75 Updated 5 months ago
- The demo page of UniAudio ☆34 Updated 9 months ago
- [ICASSP 2023] Tempo vs. Pitch: understanding self-supervised tempo estimation ☆13 Updated last year
- Legible, Scalable, Reproducible Foundation Models with Named Tensors and Jax ☆11 Updated 5 months ago
- Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor… ☆53 Updated 7 months ago
- CML-TTS: A Multilingual Dataset for Speech Synthesis ☆29 Updated 3 months ago
- ☆32 Updated 2 months ago
- Codebase and project page for EDMSound ☆29 Updated last year
- Implementation of Google's USM speech model in Pytorch ☆25 Updated last week
- Speech-MASSIVE is a multilingual Spoken Language Understanding (SLU) dataset comprising the speech counterpart for a portion of the MASSI… ☆19 Updated 2 months ago
- Generate audio datasets for training Text-To-Speech models, through smart audio splitting with silence detection, and transcription using… ☆28 Updated last year