facebookresearch / omnilingual-asr
Omnilingual ASR: Open-Source Multilingual Speech Recognition for 1600+ Languages
☆2,620 Updated last month
Alternatives and similar repositories for omnilingual-asr
Users who are interested in omnilingual-asr are comparing it to the libraries listed below.
- Qwen3-ASR is an open-source series of ASR models developed by the Qwen team at Alibaba Cloud, supporting stable multilingual speech/music… ☆756 Updated this week
- Lightning-Fast, On-Device, Multilingual TTS — running natively via ONNX. ☆2,552 Updated 2 weeks ago
- MiMo-Audio: Audio Language Models are Few-Shot Learners ☆963 Updated 4 months ago
- Qwen3-omni is a natively end-to-end, omni-modal LLM developed by the Qwen team at Alibaba Cloud, capable of understanding text, audio, im… ☆3,366 Updated 3 weeks ago
- Soprano: Instant, Ultra-Realistic Text-to-Speech ☆1,137 Updated 3 weeks ago
- TTS model capable of streaming conversational audio in realtime. ☆1,027 Updated 2 months ago
- Hibiki is a model for streaming speech translation (also known as simultaneous translation). Unlike offline translation—where one waits f… ☆1,368 Updated 9 months ago
- Official Python toolkit for the Qwen3-ASR API. Parallel high‑throughput calls, robust long‑audio transcription, multi‑sample‑rate support… ☆800 Updated 3 months ago
- The repository provides code for running inference with the Meta Segment Anything Audio Model (SAM-Audio), links for downloading the trai… ☆3,256 Updated last month
- Kyutai's Speech-To-Text and Text-To-Speech models based on the Delayed Streams Modeling framework. ☆2,822 Updated last week
- Interface for OuteTTS models. ☆1,421 Updated 7 months ago
- Make text LLMs listen and speak ☆1,152 Updated last week
- LLaSA: Scaling Train-time and Inference-time Compute for LLaMA-based Speech Synthesis ☆651 Updated 2 weeks ago
- ☆511 Updated this week
- G2P ☆400 Updated 5 months ago
- A TTS that fits in your CPU (and pocket) ☆2,683 Updated last week
- A powerful 3B-parameter, LLM-based Reinforcement Learning audio edit model excels at editing emotion, speaking style, and paralinguistics… ☆836 Updated last week
- ☆1,249 Updated last week
- Fast and High-Quality Zero-Shot Text-to-Speech with Flow Matching ☆830 Updated 2 months ago
- Liquid Audio - Speech-to-Speech audio models by Liquid AI ☆388 Updated last week
- Inference code for the paper "Spirit-LM Interleaved Spoken and Written Language Model". ☆925 Updated last year
- ☆536 Updated 4 months ago
- GLM-ASR-Nano: A robust, open-source speech recognition model with 1.5B parameters ☆724 Updated last month
- On-device TTS model by Neuphonic ☆4,718 Updated 3 weeks ago
- Fun-Audio-Chat is a Large Audio Language Model built for natural, low-latency voice interactions. ☆789 Updated last week
- VoxCPM: Tokenizer-Free TTS for Context-Aware Speech Generation and True-to-Life Voice Cloning ☆5,715 Updated last week
- Real Time Speech Transcription with FastRTC ⚡️and Local Whisper 🤗 ☆697 Updated 6 months ago
- Step-Audio 2 is an end-to-end multi-modal large language model designed for industry-strength audio understanding and speech conversation… ☆1,330 Updated 4 months ago
- LLMVoX: Autoregressive Streaming Text-to-Speech Model for Any LLM ☆294 Updated 8 months ago
- A high quality and fast TTS repository ☆486 Updated last month