This is an on-CPU real-time conversational system for two-way speech communication with AI models, utilizing a continuous streaming architecture for fluid conversations with immediate responses and natural interruption handling.
☆241Nov 24, 2025Updated 3 months ago
Alternatives and similar repositories for On-Device-Speech-to-Speech-Conversational-AI
Users that are interested in On-Device-Speech-to-Speech-Conversational-AI are comparing it to the libraries listed below
- Train and fine-tune text-to-speech models for Bengali and many other languages!☆18Apr 2, 2025Updated 11 months ago
- List of curated use cases built using Sesame's CSM 1B☆72May 29, 2025Updated 9 months ago
- ☆18May 9, 2024Updated last year
- ☆15May 13, 2024Updated last year
- High-performance, semantic turn detection for conversational AI☆35Oct 1, 2025Updated 5 months ago
- optimized wav2lip☆18Jan 6, 2024Updated 2 years ago
- Bangla Unicode Normalization☆22May 26, 2024Updated last year
- A local, voice-controlled AI assistant with the personality of HAL 9000 from 2001: A Space Odyssey.☆22Aug 16, 2025Updated 6 months ago
- phonetic similarity algorithms☆13Jun 19, 2018Updated 7 years ago
- Official repository of Tapir Lab.'s Lip-Sync Method☆10Oct 3, 2023Updated 2 years ago
- Orca is a workspace for vibe coding built upon the principles of tracking what the agent changes and only keeping what you want☆49Updated this week
- 🎵 muse: Music Separation☆11Feb 14, 2024Updated 2 years ago
- A Fourier-based audio synthesiser written in MATLAB as a university project.☆12Jan 19, 2019Updated 7 years ago
- Orpheus-TTS local speech synthesizer written entirely in C#☆29Nov 25, 2025Updated 3 months ago
- ☆56Jun 20, 2025Updated 8 months ago
- Easily create video datasets with auto-captioning for Hunyuan-Video LoRA finetuning☆14Apr 2, 2025Updated 11 months ago
- Docker image for Rerender A Video: Zero-Shot Text-Guided Video-to-Video Translation☆11Apr 14, 2024Updated last year
- Survey: A collection of AWESOME papers and resources on the latest research in Object Tracking.☆23Nov 11, 2025Updated 3 months ago
- ☆13Apr 9, 2021Updated 4 years ago
- Llama-Mimi is a speech language model that uses a unified tokenizer (Mimi) and a single Transformer decoder (Llama) to jointly model sequ…☆28Sep 20, 2025Updated 5 months ago
- open-webui-runpod-integration☆16Jan 19, 2025Updated last year
- StyleTTS2 + Vocos as a Decoder☆13Mar 24, 2025Updated 11 months ago
- This is a side project where my friend and I try to generate synthetic data in Bangla from deepseek-r1, so that it can be used for model di…☆11Jun 28, 2025Updated 8 months ago
- Roboflow's inference server to analyze video streams. This project extracts insights from video frames at defined intervals and generates…☆13May 21, 2024Updated last year
- Implementation of Sesame's Conversational Speech Model for Hugging Face Transformers☆57May 17, 2025Updated 9 months ago
- WebRTC-based real-time audio streaming with Faster Whisper ASR integration for live speech-to-text transcription.☆13Sep 27, 2024Updated last year
- Automatically generate a lip-synced avatar based off of a transcript and audio☆15Feb 17, 2023Updated 3 years ago
- Official implementation of the TTS model Lina-Speech☆179Jan 9, 2025Updated last year
- ASR papers, updated every day☆462Updated this week
- Text normalizer module for converting Bangla and English digits to textual format, normalizing dates, and extracting dates☆14Feb 25, 2026Updated last week
- SoTA open-source TTS☆26Jul 8, 2025Updated 8 months ago
- Towards Human-Sounding Speech☆5,983Dec 5, 2025Updated 3 months ago
- ☆14Aug 19, 2024Updated last year
- Ichigo Whisper is a compact (22M parameters), open-source speech tokenizer for the Whisper-medium, designed to enhance performance on mul…☆17Jan 20, 2025Updated last year
- C++ version of pyannote audio overlapped speech detection pipeline☆13Feb 14, 2024Updated 2 years ago
- A Conversational Speech Generation Model☆14,530May 27, 2025Updated 9 months ago
- Official PyTorch implementation of "Paralinguistics-Aware Speech-Empowered LLMs for Natural Conversation" (NeurIPS 2024)☆94Dec 3, 2024Updated last year
- Realtime demo, Streaming and Finetuning code for CSM☆444Sep 17, 2025Updated 5 months ago
- Fast audio super resolution from 16 kHz to 48 kHz.☆199Jan 3, 2026Updated 2 months ago