facebookresearch / omnilingual-asr
Omnilingual ASR: Open-Source Multilingual Speech Recognition for 1600+ Languages
☆1,058 · Updated this week
Alternatives and similar repositories for omnilingual-asr
Users who are interested in omnilingual-asr are comparing it to the libraries listed below.
- ☆322 · Updated last month
- LLMVoX: Autoregressive Streaming Text-to-Speech Model for Any LLM ☆289 · Updated 5 months ago
- LLaSA: Scaling Train-time and Inference-time Compute for LLaMA-based Speech Synthesis ☆626 · Updated 7 months ago
- Liquid Audio - Speech-to-Speech audio models by Liquid AI ☆234 · Updated last month
- VoiceStar: Robust, Duration-controllable TTS that can Extrapolate ☆294 · Updated 5 months ago
- G2P ☆352 · Updated 3 months ago
- ☆526 · Updated last month
- ☆243 · Updated 5 months ago
- ☆634 · Updated 3 months ago
- SlamKit is an open source tool kit for efficient training of SpeechLMs. It was used for "Slamming: Training a Speech Language Model on On… ☆221 · Updated 5 months ago
- MiMo-Audio: Audio Language Models are Few-Shot Learners ☆836 · Updated last month
- VoiceRestore: Flow-Matching Transformers for Universal Speech Restoration ☆190 · Updated 6 months ago
- PyTorch implementation of Audio Flamingo: Series of Advanced Audio Understanding Language Models ☆812 · Updated 2 weeks ago
- Collection of Open Source Speech Data ☆161 · Updated last month
- Multi-Scale Neural Audio Codec (SNAC) compresses audio into discrete codes at a low bitrate ☆709 · Updated 11 months ago
- Make text LLMs listen and speak ☆966 · Updated last week
- A TTS model capable of generating ultra-realistic dialogue in one pass. ☆214 · Updated 6 months ago
- ☆300 · Updated 2 months ago
- Open Audio Watermarking Tool ☆370 · Updated 4 months ago
- An open-source implementation of Whisper ☆455 · Updated 2 weeks ago
- ☆378 · Updated last year
- VoXtream is a Full-Stream Zero-shot TTS model with Extremely Low Latency ☆163 · Updated 2 weeks ago
- Fast Streaming TTS with Orpheus + WebRTC (with FastRTC) ☆343 · Updated 7 months ago
- Implementation of E2-TTS, "Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS", in PyTorch ☆505 · Updated 8 months ago
- Very fast, accurate speaker diarization ☆164 · Updated this week
- Kyutai with an "eye" ☆223 · Updated 7 months ago
- ☆312 · Updated last year
- Real-time Speech-Text Foundation Model Toolkit (wip) ☆247 · Updated 7 months ago
- Unified automatic quality assessment for speech, music, and sound. ☆630 · Updated 5 months ago
- Hibiki is a model for streaming speech translation (also known as simultaneous translation). Unlike offline translation—where one waits f… ☆1,322 · Updated 6 months ago