hlt-mt / mosel
Collection of Open Source Speech Data
☆153Updated 5 months ago
Alternatives and similar repositories for mosel:
Users who are interested in mosel are comparing it to the repositories listed below
- ☆206Updated last month
- ☆359Updated 8 months ago
- ☆62Updated 9 months ago
- Official implementation of the TTS model Lina-Speech☆164Updated 3 months ago
- Provides Gradio custom components that make diarization-based audio labeling easier and faster.☆62Updated 3 weeks ago
- Real-time Speech-Text Foundation Model Toolkit (wip)☆226Updated last month
- Open TTS models, built for streaming on the edge☆41Updated last month
- Codec for paper: LLaSA: Scaling Train-time and Inference-time Compute for LLaMA-based Speech Synthesis☆262Updated last month
- VoiceBench: Benchmarking LLM-Based Voice Assistants☆184Updated last week
- ☆255Updated last year
- Audio tokenization, in the fastest way possible!☆51Updated 8 months ago
- LlamaVoice is a llama-based large voice generation model, providing inference and training ability.☆232Updated 8 months ago
- ☆92Updated this week
- LLMVoX: Autoregressive Streaming Text-to-Speech Model for Any LLM☆242Updated last month
- StyleTTS-ZS: Efficient High-Quality Zero-Shot Text-to-Speech Synthesis with Distilled Time-Varying Style Diffusion☆176Updated 7 months ago
- Automatically cleans, enhances, segments, filters, and formats a dataset to fine-tune or train a voice model.☆35Updated last week
- VoiceBox neural network implementation☆106Updated 9 months ago
- ☆285Updated 10 months ago
- SlamKit is an open-source toolkit for efficient training of SpeechLMs. It was used for "Slamming: Training a Speech Language Model on On…☆205Updated last month
- VoiceRestore: Flow-Matching Transformers for Universal Speech Restoration☆162Updated 2 weeks ago
- Joint speech-language model - respond directly to audio!☆30Updated 11 months ago
- This repository contains the code and data for the paper EmoKnob: Enhance Voice Cloning with Fine-Grained Emotion Control by Haozhe Chen,…☆70Updated 7 months ago
- VoiceStar: Robust, Duration-controllable TTS that can Extrapolate☆141Updated 3 weeks ago
- An espeak-compatible, permissively-licensed IPA phonemizer (G2P) based on DeepPhonemizer. Usable as a drop-in replacement for espeak's GP…☆95Updated 6 months ago
- ☆123Updated 3 weeks ago
- Prompting Whisper for Audio-Visual Speech Recognition, Code-Switched Speech Recognition, and Zero-Shot Speech Translation☆143Updated last year
- The official Implementation of PeriodWave and PeriodWave-Turbo☆188Updated 3 weeks ago
- ☆160Updated this week
- An unofficial PyTorch implementation of VALL-E☆87Updated 2 weeks ago
- LiteASR: Efficient Automatic Speech Recognition with Low-Rank Approximation☆101Updated 2 weeks ago