allenai / OLMoASR
An open-source implementation of Whisper
☆434 Updated this week
Alternatives and similar repositories for OLMoASR
Users who are interested in OLMoASR are comparing it to the repositories listed below
- VoiceStar: Robust, Duration-controllable TTS that can Extrapolate ☆284 Updated 3 months ago
- LLMVoX: Autoregressive Streaming Text-to-Speech Model for Any LLM ☆279 Updated 4 months ago
- Make text LLMs listen and speak ☆884 Updated this week
- Official Python toolkit for the Qwen3-ASR API. Parallel high‑throughput calls, robust long‑audio transcription, multi‑sample‑rate support… ☆546 Updated this week
- ☆232 Updated 4 months ago
- Kyutai with an "eye" ☆218 Updated 5 months ago
- Self-host the ultra-lightweight Kitten TTS model with this enhanced API server with an intuitive Web UI, large text processing for audiob… ☆204 Updated last month
- Fast Streaming TTS with Orpheus + WebRTC (with FastRTC) ☆306 Updated 5 months ago
- ☆250 Updated 3 weeks ago
- ☆298 Updated 2 months ago
- ☆385 Updated this week
- Inference, Fine Tuning and many more recipes with Gemma family of models ☆268 Updated 2 months ago
- VLLM Port of the Chatterbox TTS model ☆303 Updated 2 weeks ago
- ☆634 Updated last month
- ☆639 Updated last month
- ☆516 Updated last month
- SlamKit is an open source tool kit for efficient training of SpeechLMs. It was used for "Slamming: Training a Speech Language Model on On… ☆218 Updated 4 months ago
- Streaming and Fine-tuning for Chatterbox TTS ☆185 Updated 3 months ago
- ☆155 Updated 5 months ago
- Anemoi: A Semi-Centralized Multi-agent Systems Based on Agent-to-Agent Communication MCP server from Coral Protocol ☆365 Updated 3 weeks ago
- A multi-agent LLM system for detecting and resolving cognitive dissonance. ☆265 Updated 3 weeks ago
- Tencent Hunyuan A13B (short as Hunyuan-A13B), an innovative and open-source LLM built on a fine-grained MoE architecture. ☆749 Updated 2 months ago
- AudioStory: Generating Long-Form Narrative Audio with Large Language Models ☆274 Updated this week
- The code repository of the paper: Competition and Attraction Improve Model Fusion ☆155 Updated last month
- MCP-Bench: Benchmarking Tool-Using LLM Agents with Complex Real-World Tasks via MCP Servers ☆318 Updated 2 weeks ago
- A TTS model capable of generating ultra-realistic dialogue in one pass. ☆208 Updated 5 months ago
- 🎙️ Automatically transcribe audio/video into high-quality, speaker-specific Text-To-Speech datasets ✨ ☆123 Updated last month
- ☆281 Updated 2 months ago
- Implementation of Sesame's Conversational Speech Model for Hugging Face Transformers ☆57 Updated 4 months ago
- ☆851 Updated last week