allenai / OLMoASR
An open-source implementation of Whisper
☆470 Updated 2 months ago
Alternatives and similar repositories for OLMoASR
Users who are interested in OLMoASR are comparing it to the libraries listed below.
- Liquid Audio - Speech-to-Speech audio models by Liquid AI ☆331 Updated this week
- VoiceStar: Robust, Duration-controllable TTS that can Extrapolate ☆306 Updated 7 months ago
- OmniVinci is an omni-modal LLM for joint understanding of vision, audio, and language. ☆620 Updated 2 months ago
- LLMVoX: Autoregressive Streaming Text-to-Speech Model for Any LLM ☆292 Updated 7 months ago
- VoXtream is a Full-Stream Zero-shot TTS model with Extremely Low Latency ☆181 Updated 2 months ago
- ☆252 Updated 7 months ago
- ☆243 Updated 2 weeks ago
- Official Python toolkit for the Qwen3-ASR API. Parallel high‑throughput calls, robust long‑audio transcription, multi‑sample‑rate support… ☆738 Updated 2 months ago
- WeDLM: The fastest diffusion language model with standard causal attention and native KV cache compatibility, delivering real speedups ov… ☆480 Updated last week
- Kyutai with an "eye" ☆232 Updated 9 months ago
- A highly compressive and high-quality neural audio codec for speech models. ☆176 Updated last week
- A high quality and fast TTS repository ☆442 Updated 2 weeks ago
- ☆430 Updated last month
- DACVAE ☆184 Updated 2 weeks ago
- Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B ☆559 Updated last month
- ☆345 Updated 3 months ago
- ☆440 Updated last month
- ☆378 Updated 2 months ago
- Fast audio super resolution from 16 kHz to 48 kHz. ☆167 Updated this week
- MiMo-Audio: Audio Language Models are Few-Shot Learners ☆944 Updated 3 months ago
- TTS model capable of streaming conversational audio in realtime. ☆1,007 Updated last month
- Make text LLMs listen and speak ☆1,058 Updated 2 weeks ago
- ☆339 Updated 4 months ago
- Self-host the ultra-lightweight Kitten TTS model with this enhanced API server with an intuitive Web UI, large text processing for audiob… ☆225 Updated 5 months ago
- VLLM Port of the Chatterbox TTS model ☆359 Updated 2 months ago
- This is the official repo for the paper "LongCat-Flash-Omni Technical Report" ☆451 Updated 3 weeks ago
- Soprano: Instant, Ultra-Realistic Text-to-Speech ☆677 Updated last week
- ☆635 Updated 2 months ago
- Official implementation of "Continuous Autoregressive Language Models" ☆684 Updated last month
- Fast Streaming TTS with Orpheus + WebRTC (with FastRTC) ☆347 Updated 9 months ago