Rakam-AI / rakam_systems
☆13 Updated 2 months ago
Alternatives and similar repositories for rakam_systems:
Users who are interested in rakam_systems are comparing it to the repositories listed below.
- Repo for the IDESSAI 2024 course on modeling audio with discrete tokens. ☆12 Updated 5 months ago
- Final training script from HuggingFace Whisper Fine tuning event - to get best results on finetuned model. ☆12 Updated 2 years ago
- Implementation of "Audio xLSTMs: Learning Self-supervised audio representations with xLSTMs" in PyTorch ☆18 Updated last week
- Implementation of 'Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis', in MLX ☆16 Updated 3 months ago
- StyleTTS 2 Optimized Training Fork ☆22 Updated 2 weeks ago
- Notebook and Scripts that showcase running quantized diffusion models on consumer GPUs ☆38 Updated 3 months ago
- Collection of scripts from mHuBERT-147. ☆24 Updated 3 months ago
- Audio tokenization, in the fastest way possible! ☆48 Updated 5 months ago
- Trying to build an all in one speech-text language model - a bit like GPT-4o ☆22 Updated 8 months ago
- Dippy Synthetic Speech Subnet ☆15 Updated this week
- Experimental playground for benchmarking language model (LM) architectures, layers, and tricks on smaller datasets. Designed for flexible… ☆15 Updated this week
- AI town (https://github.com/a16z-infra/ai-town): Patches to run on Hugging Face Spaces ☆19 Updated 8 months ago
- A HuggingFace compatible Small Language Model trainer. ☆74 Updated 2 weeks ago
- Code from blog 'Searching by Music: Leveraging Vector Search for Music Information Retrieval' ☆16 Updated last year
- Simple PyTorch Denoisers for Waveform Audio ☆34 Updated 2 months ago
- SpeechGLUE is a speech version of the GLUE benchmark, driven by text-to-speech. ☆13 Updated last year
- Using short models to classify long texts ☆21 Updated last year
- babyLM WhisBERT code ☆18 Updated 8 months ago
- A simple system for 2-way interruptible voice interactions between human and LLM ☆22 Updated last year
- Repository for fine-tuning Transformers 🤗 based seq2seq speech models in JAX/Flax. ☆34 Updated last year
- CML-TTS: A Multilingual Dataset for Speech Synthesis ☆29 Updated 6 months ago
- A lightweight Python library for running TTS models with a unified API. ☆16 Updated this week