openai / openai-realtime-embedded-sdk

An SDK for using the Realtime API with microcontrollers like the ESP32

☆1,468

Alternatives and similar repositories for openai-realtime-embedded-sdk:

Users who are interested in openai-realtime-embedded-sdk are comparing it to the libraries listed below.

wwbin2017 / bailing
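Bailing is a GPT-4o-like voice conversation robot built on ASR+LLM+TTS. It integrates strong large models such as DeepSeek R1, achieves latency as low as 800 ms, runs on low-spec machines such as Macs, and supports interruption.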
☆633 Updated this week
78 / xiaozhi
Build your own AI friend
☆353 Updated 3 weeks ago
wangzongming / esp-ai
The simplest and lowest-cost AI integration solution. If you like this project, please give it a Star~
☆512 Updated this week
facebookresearch / spiritlm
Inference code for the paper "Spirit-LM Interleaved Spoken and Written Language Model".
☆879 Updated 3 months ago
thewh1teagle / kokoro-onnx
TTS with kokoro and onnx runtime
☆1,614 Updated last week
NVIDIA-AI-Blueprints / pdf-to-podcast
Transform PDFs into AI podcasts for engaging on-the-go audio content.
☆533 Updated 2 weeks ago
dsa / fast-voice-assistant
⚡ Insanely fast AI voice assistant with <500ms response times
☆373 Updated 2 months ago
FunAudioLLM / SenseVoice
Multilingual Voice Understanding Model
☆4,551 Updated last month
zcaceres / markdownify-mcp
A Model Context Protocol server for converting almost anything to Markdown
☆303 Updated 3 weeks ago
usefulsensors / moonshine
Fast and accurate automatic speech recognition (ASR) for edge devices
☆2,572 Updated 2 weeks ago
THUDM / GLM-4-Voice
GLM-4-Voice | End-to-end Chinese-English spoken dialogue model
☆2,669 Updated 2 months ago
astramind-ai / Auralis
A Fast TTS Engine
☆451 Updated 3 weeks ago
corbt / agent.exe
☆3,388 Updated 3 months ago
gpt-omni / mini-omni
open-source multimodal large language model that can hear, talk while thinking. Featuring real-time end-to-end speech input and streaming…
☆3,157 Updated 3 months ago
ictnlp / LLaMA-Omni
LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve spee…
☆2,809 Updated 3 months ago
QwenLM / Qwen2-Audio
The official repo of Qwen2-Audio chat & pretrained large audio language model proposed by Alibaba Cloud.
☆1,513 Updated 6 months ago
Standard-Intelligence / hertz-dev
first base model for full-duplex conversational audio
☆1,707 Updated last month
pingcap / autoflow
pingcap/autoflow is a Graph RAG based and conversational knowledge base tool built with TiDB Serverless Vector Storage. Demo: https://tid…
☆2,274 Updated this week
theredsix / cerebellum
Browser automation system that uses AI-driven planning to navigate web pages and perform goals.
☆730 Updated last month
pipecat-ai / rtvi-web-demo
Example UI implementing the RTVI web client
☆474 Updated 2 months ago
CerebriumAI / examples
Examples for Cerebrium Serverless GPUs
☆461 Updated this week
aiola-lab / whisper-medusa
Whisper with Medusa heads
☆822 Updated last week
memodb-io / memobase
Profile-Based Long-Term Memory for AI Applications
☆551 Updated this week
Explorerlowi / ESP32_AI_LLM
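This project connects the ESP32 and ESP32-S3 to 15 large models, including ChatGPT, Claude, iFlytek Spark, and Doubao, to enable voice chat. It supports voice wake-up, continuous conversation, music playback, and other features, and an attached display shows the conversation content in real time.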
☆334 Updated 2 months ago
echohive42 / AI-reads-books-page-by-page
AI reads books: Page-by-Page PDF Knowledge Extractor & Summarizer. script performs an intelligent page-by-page analysis of PDF books, met…
☆1,355 Updated last month
TEN-framework / ten_framework
TEN, a voice agent framework to create conversational AI.
☆544 Updated this week
wisupai / e2m
E2M converts various file types (doc, docx, epub, html, htm, url, pdf, ppt, pptx, mp3, m4a) into Markdown. It’s easy to install, with ded…
☆963 Updated 5 months ago