nari-labs / dia
A TTS model capable of generating ultra-realistic dialogue in one pass.
☆15,109 Updated this week
Alternatives and similar repositories for dia
Users who are interested in dia are comparing it to the libraries listed below.
- A Conversational Speech Generation Model ☆13,139 Updated last month
- Towards Human-Sounding Speech ☆4,703 Updated last week
- Suna - Open Source Generalist AI Agent ☆10,495 Updated this week
- Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching" ☆11,755 Updated last week
- Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audi… ☆8,188 Updated last week
- ☆5,117 Updated last month
- Kimi-Audio, an open-source audio foundation model excelling in audio understanding, generation, and conversation ☆3,500 Updated last week
- A robust, efficient, low-latency speech-to-text library with advanced voice activity detection, wake word activation and instant transcri… ☆7,086 Updated last week
- 🚀 The fast, Pythonic way to build MCP servers and clients ☆9,251 Updated this week
- Build Real-Time Knowledge Graphs for AI Agents ☆8,541 Updated this week
- ☆5,748 Updated last week
- Autonomous coding agent right in your IDE, capable of creating/editing files, executing commands, using the browser, and more with your p… ☆43,286 Updated this week
- A GUI Agent application based on UI-TARS (Vision-Language Model) that allows you to control your computer using natural language. ☆13,298 Updated this week
- Open Source framework for voice and multimodal conversational AI ☆6,065 Updated this week
- SOTA Open Source TTS ☆21,025 Updated last month
- Open-Source Chrome extension for AI-powered web automation. Run multi-agent workflows using your own LLM API key. Alternative to OpenAI O… ☆5,444 Updated last week
- SkyReels-V2: Infinite-length Film Generative model ☆2,183 Updated last week
- Run AI Agent in your browser. ☆12,919 Updated this week
- Agno is a lightweight library for building Agents with memory, knowledge, tools and reasoning. ☆26,465 Updated this week
- Fully local web research and report writing assistant ☆7,308 Updated last month
- Agent S: an open agentic framework that uses computers like a human ☆4,558 Updated this week
- The python library for real-time communication ☆3,851 Updated 3 weeks ago
- MAGI-1: Autoregressive Video Generation at Scale ☆2,956 Updated last week
- A lightweight, powerful framework for multi-agent workflows ☆10,047 Updated last week
- Zonos-v0.1 is a leading open-weight text-to-speech model trained on more than 200k hours of varied multilingual speech, delivering expres… ☆6,551 Updated 2 months ago
- 🪄 Create rich visualizations with AI ☆11,504 Updated this week
- https://hf.co/hexgrad/Kokoro-82M ☆2,777 Updated last week
- StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models ☆5,714 Updated 9 months ago
- Lets make video diffusion practical! ☆12,540 Updated last week
- A fast multimodal LLM for real-time voice ☆3,916 Updated 3 months ago