allenai / OLMoASR
An open-source implementation of Whisper
☆475 Updated 3 months ago
Alternatives and similar repositories for OLMoASR
Users who are interested in OLMoASR are comparing it to the libraries listed below.
- Liquid Audio - Speech-to-Speech audio models by Liquid AI ☆382 Updated last week
- VoiceStar: Robust, Duration-controllable TTS that can Extrapolate ☆306 Updated 7 months ago
- ☆245 Updated last month
- VoXtream is a Full-Stream Zero-shot TTS model with Extremely Low Latency ☆181 Updated 3 months ago
- LLMVoX: Autoregressive Streaming Text-to-Speech Model for Any LLM ☆294 Updated 8 months ago
- A highly compressive and high-quality neural audio codec for speech models. ☆227 Updated last week
- A high quality and fast TTS repository ☆486 Updated last month
- ☆439 Updated last month
- Self-host the ultra-lightweight Kitten TTS model with this enhanced API server with an intuitive Web UI, large text processing for audiob… ☆233 Updated 5 months ago
- OmniVinci is an omni-modal LLM for joint understanding of vision, audio, and language. ☆628 Updated 3 months ago
- ☆502 Updated this week
- A high-quality rapid TTS voice cloning model that reaches speeds of 150x realtime. ☆336 Updated this week
- Soprano-Factory: Train your own 2000x realtime text-to-speech model ☆156 Updated 2 weeks ago
- DACVAE ☆189 Updated last month
- WeDLM: The fastest diffusion language model with standard causal attention and native KV cache compatibility, delivering real speedups ov… ☆588 Updated 2 weeks ago
- ☆368 Updated 3 months ago
- Kyutai with an "eye" ☆235 Updated 10 months ago
- Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B ☆565 Updated 2 months ago
- VLLM Port of the Chatterbox TTS model ☆364 Updated 3 months ago
- ☆257 Updated 8 months ago
- Streaming and Fine-tuning for Chatterbox TTS ☆262 Updated 7 months ago
- Fast audio super resolution from 16 kHz to 48 kHz. ☆188 Updated 3 weeks ago
- Official Python toolkit for the Qwen3-ASR API. Parallel high‑throughput calls, robust long‑audio transcription, multi‑sample‑rate support… ☆747 Updated 3 months ago
- TTS model capable of streaming conversational audio in realtime. ☆1,027 Updated 2 months ago
- Make text LLMs listen and speak ☆1,133 Updated last week
- Fast Streaming TTS with Orpheus + WebRTC (with FastRTC) ☆348 Updated 9 months ago
- ☆385 Updated 2 months ago
- ☆572 Updated 2 weeks ago
- ☆345 Updated 5 months ago
- This is the official repo for the paper "LongCat-Flash-Omni Technical Report" ☆460 Updated last week