VikhrModels / Salt
☆55Updated this week
Alternatives and similar repositories for Salt
Users that are interested in Salt are comparing it to the libraries listed below
- Open TTS models, built for streaming on the edge☆42Updated 6 months ago
- Voxtral: Convert Mistral into an end2end SpeechLM. No information bottleneck, preserves prosody, learns interruptions from data. Unlike GP…☆30Updated 6 months ago
- ☆140Updated 3 weeks ago
- Audio tokenization, in the fastest way possible!☆53Updated last year
- [EMNLP Main '25] LiteASR: Efficient Automatic Speech Recognition with Low-Rank Approximation☆127Updated 4 months ago
- SlamKit is an open source tool kit for efficient training of SpeechLMs. It was used for "Slamming: Training a Speech Language Model on On…☆218Updated 4 months ago
- 🎙️ Automatically transcribe audio/video into high-quality, speaker-specific Text-To-Speech datasets ✨☆123Updated last month
- Legible, Scalable, Reproducible Foundation Models with Named Tensors and Jax☆15Updated last year
- Provide Gradio custom components to make the diarization-based audio labeling process easier and faster.☆68Updated last week
- ☆62Updated last year
- An open source replication of the strawberry method that leverages Monte Carlo Search with PPO and/or DPO☆29Updated this week
- Official repository of Wavehax vocoder☆53Updated last month
- Official implementation of the TTS model Lina-Speech☆168Updated 8 months ago
- Collection of Open Source Speech Data☆160Updated this week
- Experimental playground for benchmarking language model (LM) architectures, layers, and tricks on smaller datasets. Designed for flexible…☆79Updated 2 weeks ago
- A collection of optimized utilities for text-to-audio processing, enhancing both training and inference workflows. This repository contai…☆39Updated 5 months ago
- Framework for processing and filtering datasets☆27Updated last year
- Use quantized versions of Whisper to speed up inference☆12Updated 11 months ago
- Code associated with the paper: CTC-DRO: Robust Optimization for Reducing Language Disparities in Speech Recognition.☆15Updated 4 months ago
- Official implementation of the paper "Linear Transformers with Learnable Kernel Functions are Better In-Context Models"☆163Updated 8 months ago
- Implementation of Strassen attention, from Kozachinskiy et al. of National Center of AI in Chile☆41Updated 2 months ago
- ⚡ Blazing fast audio augmentation in Python, powered by GPU for high-efficiency processing in machine learning and audio analysis tasks.☆33Updated last year
- ☆40Updated this week
- ☆29Updated 2 weeks ago
- [NCMMSC'2024] Emotion-Aware Prosodic Phrasing for Expressive Text-to-Speech☆22Updated last year
- Open-source reproducible benchmarks from Argmax☆58Updated last week
- This project trains an RWKV LLM for TTS generation that is compatible with other TTS engines (like fish/cosy/chattts).☆84Updated this week
- Normalize Text in Russian☆27Updated last year
- An unofficial PyTorch implementation of VALL-E☆88Updated last month
- Official repository of the IEEE SLT 2024 paper "Self-Supervised Syllable Discovery Based on Speaker-Disentangled HuBERT"☆42Updated this week