Rakam-AI / rakam_systems
☆13 · Updated 3 weeks ago
Alternatives and similar repositories for rakam_systems
Users who are interested in rakam_systems are comparing it to the libraries listed below.
- Open TTS models, built for streaming on the edge · ☆41 · Updated 2 months ago
- ☆49 · Updated 2 months ago
- Legible, Scalable, Reproducible Foundation Models with Named Tensors and Jax · ☆14 · Updated 11 months ago
- ☆29 · Updated 2 weeks ago
- Implementation of 'Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis', in MLX · ☆18 · Updated 6 months ago
- Notebook and Scripts that showcase running quantized diffusion models on consumer GPUs · ☆38 · Updated 6 months ago
- Audio tokenization, in the fastest way possible! · ☆52 · Updated 8 months ago
- Minimalist agent framework for AI engineers · ☆10 · Updated last month
- Final training script from HuggingFace Whisper Fine tuning event - to get best results on finetuned model. · ☆12 · Updated 2 years ago
- Collection of scripts from mHuBERT-147. · ☆24 · Updated 5 months ago
- Repository for "TESS-2: A Large-Scale, Generalist Diffusion Language Model" · ☆34 · Updated 2 months ago
- (WIP) A retrain of F5-TTS on permissively-licensed data · ☆11 · Updated last month
- Rust crate for some audio utilities · ☆23 · Updated 2 months ago
- ☆20 · Updated 2 years ago
- a Neural Vocoder supporting Ring Attention, Conformer and NSF. · ☆18 · Updated 3 months ago
- Implementation of "Audio xLSTMs: Learning Self-supervised audio representations with xLSTMs" in PyTorch · ☆18 · Updated 3 weeks ago
- StyleTTS 2 Optimized Training Fork · ☆28 · Updated 3 months ago
- Code associated with the paper: CTC-DRO: Robust Optimization for Reducing Language Disparities in Speech Recognition. · ☆14 · Updated 2 months ago
- implementation of https://arxiv.org/pdf/2312.09299 · ☆20 · Updated 10 months ago
- GPT for FACodec · ☆13 · Updated last year
- ☆62 · Updated 9 months ago
- Open-source and reproducible benchmarks for Speaker Diarization · ☆24 · Updated 3 weeks ago
- CML-TTS: A Multilingual Dataset for Speech Synthesis · ☆31 · Updated 9 months ago
- Trying to build an all in one speech-text language model - a bit like GPT-4o · ☆22 · Updated 11 months ago
- This app is intended to automatically create a corpus for ASR systems using pseudo-labeling. · ☆27 · Updated last year
- Provide Gradio custom components to make the diarization-based audio labeling process easier and faster. · ☆62 · Updated last month
- Repository for fine-tuning Transformers 🤗 based seq2seq speech models in JAX/Flax. · ☆36 · Updated 2 years ago
- A lightweight Python library for running TTS models with a unified API. · ☆18 · Updated 2 months ago
- ☆24 · Updated last week
- ☆84 · Updated last year