Ikaros-521/RealtimeSTT_LLM_TTS

Ikaros-521 / RealtimeSTT_LLM_TTS

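Real-time STT connected to the OpenAI API / Zhipu AI (streaming LLM) and GPT-SOVITS / Edge-TTS, served through a web page for cross-network service calls to achieve real-time conversation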

☆433

Alternatives and similar repositories for RealtimeSTT_LLM_TTS

Users who are interested in RealtimeSTT_LLM_TTS are comparing it to the libraries listed below.

ABexit / ASR-LLM-TTS
This is a speech interaction system built on an open-source model, integrating ASR, LLM, and TTS in sequence. The ASR model is SenseVoice…
☆1,262 · Jun 3, 2026 · Updated last month
Kedreamix / Linly-Talker
Digital Avatar Conversational System - Linly-Talker. 😄✨ Linly-Talker is an intelligent AI system that combines large language models (LL…
☆3,400 · Feb 10, 2026 · Updated 5 months ago
lililuya / Graduation-Project
A high-quality emotional virtual human system based on large models (Gradio+FUNASR+ChatGLM2-6B+GPT-SOVITS+EAT+GFPGAN)
☆36 · Aug 6, 2025 · Updated 11 months ago
wwbin2017 / bailing
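Bailing is a GPT-4o-style voice conversation bot implemented with ASR+LLM+TTS. It integrates strong large models such as DeepSeek R1, connects to openClaw, and acts as a true personal voice assistant, with latency as low as 800 ms. It runs on low-spec machines such as Macs and supports interruption
☆1,742 · Apr 6, 2026 · Updated 3 months ago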
Hujiazeng / Vach
Real time streaming talking head
☆478 · May 17, 2024 · Updated 2 years ago
lipku / LiveTalking
Real time interactive streaming digital human
☆8,499 · Jul 19, 2026 · Updated last week
Ikaros-521 / AI-Vtuber
AI Vtuber is a virtual streamer [Live2D/UE/xuniren] driven by [ChatterBot/ChatGPT/claude/langchain/chatglm/text-gen-webui/Wenda/Qianwen/kimi/ollama] that can, on [Bilibili/Douyin/…
☆4,410 · Jul 29, 2025 · Updated 11 months ago
zxs731 / VoiceAssistant
☆16 · May 14, 2024 · Updated 2 years ago
QwenAudio / SenseVoice
Open-source SenseVoiceSmall model for Mandarin, Cantonese, English, Japanese, and Korean ASR, language ID, emotion recognition, and audio…
☆8,935 · Updated this week
2DIPW / gpt_sovits_infer_with_emotion
A GPT-SoVITS inference demo that automatically switches the reference audio based on sentiment analysis of Chinese text
☆108 · Mar 8, 2024 · Updated 2 years ago
0x5446 / api4sensevoice
API and websocket server for sensevoice. It has inherited some enhanced features, such as VAD detection, real-time streaming recognition,…
☆538 · Oct 23, 2024 · Updated last year
modelscope / FunASR
Open-source speech recognition toolkit for training, inference, streaming ASR, VAD, punctuation, speaker diarization pipelines, and OpenA…
☆19,467 · Updated this week
Acoucou / MY_Assistant
☆14 · Jan 14, 2024 · Updated 2 years ago
gan / glm4v-assistant
Sample GLM4V + ChatTTS AI assistant
☆85 · Jun 6, 2024 · Updated 2 years ago
KoljaB / RealtimeSTT
A robust, efficient, low-latency speech-to-text library with advanced voice activity detection, wake word activation and instant transcri…
☆10,005 · Jun 12, 2026 · Updated last month
Henry-23 / VideoChat
Real-time voice-interactive digital human with customizable appearance and voice, supporting voice cloning, with conversation latency as low as 3 s…
☆1,296 · Dec 18, 2025 · Updated 7 months ago
Ikaros-521 / digital_human_video_player
Luoxi digital human video player with an HTTP API. It uses the gradio API to interface with Easy-Wav2Lip, Sadtalker, GeneFacePlusPlus, and MuseTalk, and can also be used to play local videos
☆173 · Oct 20, 2024 · Updated last year
lutongyv / Textin_Tester
To try textin document parsing, visit https://cc.co/16YSIy
☆21 · Jul 9, 2024 · Updated 2 years ago
v3ucn / live2d-TTS-LLM-GPT-SoVITS-Vtuber
A simple, low-cost live-streaming solution based on live2d, TTS (text-to-speech), and large-model chat
☆281 · Jul 4, 2024 · Updated 2 years ago
kleinlee / DH_live
A digital human that everyone can use
☆2,080 · May 21, 2026 · Updated 2 months ago
2noise / ChatTTS
A generative speech model for daily dialogue.
☆39,678 · Apr 10, 2026 · Updated 3 months ago
QwenAudio / CosyVoice
Multilingual large voice generation model, providing full-stack inference, training, and deployment capabilities.
☆22,400 · May 25, 2026 · Updated 2 months ago
RVC-Boss / GPT-SoVITS
1 min voice data can also be used to train a good TTS model! (few shot voice cloning)
☆60,109 · Updated this week
TouchSky-Lab / Awesome-Text-to-Speech-TTS
Awesome TTS
☆63 · Sep 16, 2021 · Updated 4 years ago
fishaudio / Bert-VITS2
vits2 backbone with multilingual-bert
☆8,780 · Updated this week
waityousea / xuniren
☆601 · Jan 8, 2024 · Updated 2 years ago
ruzhila / voiceapi
Streaming ASR and TTS based on FastAPI + sherpa-onnx
☆222 · Nov 2, 2025 · Updated 8 months ago
MemFire-Cloud / supabase-wechat-stable
An isomorphic JavaScript client for Supabase.
☆10 · Oct 24, 2022 · Updated 3 years ago
lhl / voicechat2
Local SRT/LLM/TTS Voicechat
☆775 · Oct 12, 2024 · Updated last year
jianchang512 / ChatTTS-ui
A simple local web interface that uses ChatTTS to synthesize text into speech and also exposes an API for external use…
☆7,626 · Jun 14, 2026 · Updated last month
AQinTrue / typora-alist-image-uploader
A simple tool for conveniently uploading images from the Typora editor to the Alist cloud storage service
☆12 · Oct 8, 2025 · Updated 9 months ago
morettt / SenseAI
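A multifunctional AI bot combining ASR+LLM+TTS+monitoring. Supports all models that use the OpenAI API call format. Supports streaming LLM output, as well as conversation interruption and video conversation
☆30 · Apr 15, 2025 · Updated last year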
yugongcoding / yugong_wiki
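Yugong wiki is a lightweight platform for online blogs, knowledge bases, personal notes, or enterprise document collaboration. You can download the desktop version to use as a personal notebook, edit documents online, or deploy it yourself as a service, since it is a fully open-source writing platform
☆18 · Jul 22, 2024 · Updated 2 years ago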
fishaudio / fish-speech
SOTA Open Source TTS
☆31,373 · Updated this week
lenML / Speech-AI-Forge
🍦 Speech-AI-Forge is a project developed around TTS generation model, implementing an API Server and a Gradio-based WebUI.
☆1,415 · May 21, 2026 · Updated 2 months ago
PeterH0323 / Streamer-Sales
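Streamer-Sales (Sales Champion): a sales-streamer LLM 🛒🎁 that presents products based on their given features, with the aim of stimulating users' desire to buy. 🚀⭐ Includes a detailed data-generation pipeline ❗ 📦 Also integrates LMDeploy accelerated inference 🚀, RAG (retrieval-augmented generation) 📚, TTS text…
☆3,743 · Mar 8, 2025 · Updated last year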
jianchang512 / zh_recogn
Recognizes Chinese speech in audio or video and exports it as srt subtitles, based on the ModelScope community's Paraformer model
☆116 · Jul 10, 2024 · Updated 2 years ago
KevinWang676 / Bark-Voice-Cloning
Bark Voice Cloning and Voice Cloning for Chinese Speech
☆2,949 · May 31, 2026 · Updated last month
zai-org / GLM-4-Voice
GLM-4-Voice | End-to-end Chinese-English speech conversation model
☆3,209 · Dec 5, 2024 · Updated last year