SesameAILabs / csm
A Conversational Speech Generation Model
☆13,139 Updated last month
Alternatives and similar repositories for csm
Users who are interested in csm are comparing it to the libraries listed below
- Towards Human-Sounding Speech ☆4,703 Updated last week
- A TTS model capable of generating ultra-realistic dialogue in one pass. ☆15,109 Updated this week
- https://hf.co/hexgrad/Kokoro-82M ☆2,777 Updated last week
- A fast multimodal LLM for real-time voice ☆3,934 Updated 3 months ago
- Open Source framework for voice and multimodal conversational AI ☆6,065 Updated this week
- Dockerized FastAPI wrapper for Kokoro-82M text-to-speech model w/CPU ONNX and NVIDIA GPU PyTorch support, handling, and auto-stitching ☆2,617 Updated this week
- Zonos-v0.1 is a leading open-weight text-to-speech model trained on more than 200k hours of varied multilingual speech, delivering expres… ☆6,551 Updated 2 months ago
- Kimi-Audio, an open-source audio foundation model excelling in audio understanding, generation, and conversation ☆3,500 Updated last week
- An AI-powered research assistant that performs iterative, deep research on any topic by combining search engines, web scraping, and large… ☆16,069 Updated last month
- Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching" ☆11,755 Updated last week
- YuE: Open Full-song Music Generation Foundation Model, something similar to Suno.ai but open ☆4,945 Updated this week
- Run AI Agent in your browser. ☆12,919 Updated this week
- An AI web browsing framework focused on simplicity and extensibility. ☆11,794 Updated this week
- Playwright MCP server ☆9,728 Updated this week
- Open-Source Chrome extension for AI-powered web automation. Run multi-agent workflows using your own LLM API key. Alternative to OpenAI O… ☆5,527 Updated last week
- A text-to-speech (TTS), speech-to-text (STT) and speech-to-speech (STS) library built on Apple's MLX framework, providing efficient speec… ☆1,345 Updated this week
- Toolkit for linearizing PDFs for LLM datasets/training ☆12,311 Updated this week
- ☆5,117 Updated last month
- TTS with kokoro and onnx runtime ☆1,960 Updated this week
- Playwright Model Context Protocol Server - Tool to automate Browsers and APIs in Claude Desktop, Cline, Cursor IDE and More 🔌 ☆3,507 Updated this week
- Build Real-Time Knowledge Graphs for AI Agents ☆8,541 Updated this week
- Prompt, run, edit, and deploy full-stack web applications ☆14,611 Updated 4 months ago
- ☆18,825 Updated this week
- Wan: Open and Advanced Large-Scale Video Generative Models ☆11,042 Updated last week
- Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audi… ☆8,188 Updated last week
- A GUI Agent application based on UI-TARS(Vision-Language Model) that allows you to control your computer using natural language. ☆13,298 Updated this week
- A lightweight, powerful framework for multi-agent workflows ☆10,047 Updated last week
- Instant voice cloning by MIT and MyShell. Audio foundation model. ☆32,146 Updated 3 weeks ago
- React app for inspecting, building and debugging with the Realtime API ☆3,218 Updated 2 months ago
- DeerFlow is a community-driven framework for deep research, combining language models with tools like web search, crawling, and Python ex… ☆8,688 Updated this week