facebookresearch / dacvae
DACVAE
☆124Updated this week
Alternatives and similar repositories for dacvae
Users who are interested in dacvae are comparing it to the libraries listed below.
- Voxtral: Convert Mistral into an end2end SpeechLM. No information bottleneck, preserves prosody, learns interruptions from data. Unlike GP…☆37Updated 9 months ago
- Liquid Audio - Speech-to-Speech audio models by Liquid AI☆298Updated 2 months ago
- ☆62Updated last year
- Kyutai with an "eye"☆230Updated 8 months ago
- ☆317Updated 3 months ago
- ☆218Updated 2 months ago
- Provide Gradio custom components to make the diarization-based audio labeling process easier and faster.☆69Updated 2 months ago
- ☆22Updated 4 months ago
- ☆78Updated 7 months ago
- Collection of Open Source Speech Data☆163Updated 2 months ago
- Open TTS models, built for streaming on the edge☆44Updated 9 months ago
- Retrieve the source code for any model made available on replicate.com!☆36Updated last year
- ☆27Updated last year
- VoiceStar: Robust, Duration-controllable TTS that can Extrapolate☆298Updated 6 months ago
- Lightweight package that tracks and summarizes code changes using LLMs (Large Language Models)☆34Updated 9 months ago
- Safely push a Cog model version by making sure it works and is backwards-compatible with previous versions.☆16Updated 2 weeks ago
- 🎙️ Automatically transcribe audio/video into high-quality, speaker-specific Text-To-Speech datasets ✨☆130Updated 4 months ago
- An open source replication of the strawberry method that leverages Monte Carlo Search with PPO and/or DPO☆29Updated last week
- SlamKit is an open source tool kit for efficient training of SpeechLMs. It was used for "Slamming: Training a Speech Language Model on On…☆224Updated 7 months ago
- ☆62Updated 5 months ago
- ☆19Updated last year
- A package for NeuCodec: a 50 Hz, 0.8 kbps, 24 kHz audio codec.☆133Updated 2 months ago
- FLM-Audio is an audio-language subversion of RoboEgo/FLM-Ego -- an omnimodal model with native full duplexity.☆52Updated last week
- A simple, hackable text-to-speech system in PyTorch and MLX☆184Updated 4 months ago
- ☆24Updated last year
- Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna☆59Updated 2 months ago
- Cog wrapper for collabora/WhisperSpeech☆25Updated last year
- ☆94Updated last year
- Audio tokenization, in the fastest way possible!☆53Updated last year
- The official GitHub Page for MiniMax☆60Updated last month