allenai / OLMoASR
An open-source implementation of Whisper
☆447 Updated last week
Alternatives and similar repositories for OLMoASR
Users who are interested in OLMoASR are comparing it to the libraries listed below.
- VoiceStar: Robust, Duration-controllable TTS that can Extrapolate ☆288 Updated 4 months ago
- Liquid Audio - Speech-to-Speech audio models by Liquid AI ☆193 Updated 2 weeks ago
- Kyutai with an "eye" ☆222 Updated 6 months ago
- LLMVoX: Autoregressive Streaming Text-to-Speech Model for Any LLM ☆284 Updated 5 months ago
- Make text LLMs listen and speak ☆910 Updated last week
- ☆234 Updated 4 months ago
- Official Python toolkit for the Qwen3-ASR API. Parallel high‑throughput calls, robust long‑audio transcription, multi‑sample‑rate support… ☆625 Updated 3 weeks ago
- VoXtream is a Full-Stream Zero-shot TTS model with Extremely Low Latency ☆151 Updated this week
- ☆634 Updated 2 months ago
- ☆309 Updated 2 weeks ago
- ☆265 Updated last month
- Self-host the ultra-lightweight Kitten TTS model with this enhanced API server with an intuitive Web UI, large text processing for audiob… ☆206 Updated 2 months ago
- ☆184 Updated this week
- Fast Streaming TTS with Orpheus + WebRTC (with FastRTC) ☆334 Updated 6 months ago
- MiMo-Audio: Audio Language Models are Few-Shot Learners ☆760 Updated 3 weeks ago
- VLLM Port of the Chatterbox TTS model ☆313 Updated last month
- Anemoi: A Semi-Centralized Multi-agent Systems Based on Agent-to-Agent Communication MCP server from Coral Protocol ☆367 Updated last month
- Inference, Fine Tuning and many more recipes with Gemma family of models ☆271 Updated 3 months ago
- A multi-agent LLM system for detecting and resolving cognitive dissonance. ☆267 Updated this week
- ☆826 Updated last month
- Open Audio Watermarking Tool ☆334 Updated 3 months ago
- AudioStory: Generating Long-Form Narrative Audio with Large Language Models ☆282 Updated 3 weeks ago
- ☆522 Updated 2 weeks ago
- SlamKit is an open source tool kit for efficient training of SpeechLMs. It was used for "Slamming: Training a Speech Language Model on On… ☆218 Updated 5 months ago
- The code repository of the paper: Competition and Attraction Improve Model Fusion ☆159 Updated last month
- 🎙️ Automatically transcribe audio/video into high-quality, speaker-specific Text-To-Speech datasets ✨ ☆127 Updated 2 months ago
- Collection of Open Source Speech Data ☆161 Updated 2 weeks ago
- ☆144 Updated 2 months ago
- Tencent Hunyuan A13B (short as Hunyuan-A13B), an innovative and open-source LLM built on a fine-grained MoE architecture. ☆791 Updated 3 months ago
- A TTS model capable of generating ultra-realistic dialogue in one pass. ☆212 Updated 5 months ago