eustlb / speech-to-speech
Speech To Speech: an effort for an open-sourced and modular GPT4-o
☆62Updated 8 months ago
Alternatives and similar repositories for speech-to-speech
Users that are interested in speech-to-speech are comparing it to the libraries listed below
- An open-source chatbot architecture for voice/vision (and multimodal) assistants that runs locally (CPU/GPU bound) or remotely (I/O bound).☆55Updated this week
- A real-time speech-to-speech chatbot powered by Whisper Small, Llama 3.2, and Kokoro-82M.☆229Updated 5 months ago
- Simulates talk with an AI that can express emotions☆71Updated this week
- ☆188Updated last month
- Efficient approach to speaker diarization using voice characteristics extraction☆96Updated this week
- Real-time Voice Activity Detection (VAD) with example use cases such as a simple voice bot and live (real-time) transcription☆82Updated last year
- Service for testing out the new Qwen2.5 omni model☆51Updated last month
- This is an on-CPU real-time conversational system for two-way speech communication with AI models, utilizing a continuous streaming archi…☆136Updated 2 months ago
- A basic voice agent built with Python agents framework☆49Updated last month
- ASR + diarization model server with speculative decoding☆60Updated last year
- B-Llama3o: a Llama 3 with vision and audio understanding, as well as text, audio, and animation data output.☆26Updated last year
- ☆174Updated last year
- 🗣️ Real‑time, low‑latency voice, vision, and conversational‑memory AI assistant built on LiveKit and local LLMs ✨☆43Updated 3 weeks ago
- Streaming and Fine-tuning for Chatterbox TTS☆99Updated last week
- G2P☆258Updated last month
- Cascading voice assistant combining real-time speech recognition, AI reasoning, and neural text-to-speech capabilities.☆97Updated last month
- Multilingual extension of the SesameAILabs Conversational Speech Generation Model☆26Updated 2 months ago
- A WebRTC server that allows you to interact with an LLM using your speech and responds back with generated audio.☆132Updated last year
- Roomey is a multi-purpose Voice Agent designed to run your personal and business life.☆27Updated last week
- List of curated use cases built using Sesame's CSM 1B☆66Updated 3 weeks ago
- Talk To AI with FastRTC enables natural, real-time voice conversations with AI using WebRTC, offering customizable voices, interfaces, an…☆34Updated 3 months ago
- A TTS model capable of generating ultra-realistic dialogue in one pass.☆174Updated 2 months ago
- This is a repository that collects common audio noise reduction models, using Gradio to demonstrate the use of each model, which is very …☆39Updated 6 months ago
- Have a natural voice conversation with an LLM☆250Updated 6 months ago
- Examples of using the llasa-tts models locally☆173Updated 2 months ago
- On-device streaming text-to-speech engine powered by deep learning☆87Updated last week
- Speech-to-speech AI assistant with natural conversation flow, mid-speech interruption, vision capabilities and AI-initiated follow-ups. F…☆163Updated 2 months ago
- BUD-E (Buddy) is an open-source voice assistant framework that facilitates seamless interaction with AI models and APIs, enabling the cre…☆20Updated 8 months ago
- A streaming whisper server for on-prem transcription☆20Updated 10 months ago
- Provide Gradio custom components to make the diarization-based audio labeling process easier and faster.☆62Updated 3 weeks ago