tarzain / crosstalk
a simple system for 2-way interruptible voice interactions between human and LLM
☆25 Updated last year
Alternatives and similar repositories for crosstalk:
Users who are interested in crosstalk are comparing it to the libraries listed below.
- A lightweight Python library for running TTS models with a unified API. ☆17 Updated last month
- Joint speech-language model - respond directly to audio! ☆30 Updated 10 months ago
- proof of concept conversation orchestrator with a speech-language model ☆19 Updated 5 months ago
- Audio tokenization, in the fastest way possible! ☆49 Updated 7 months ago
- Open TTS models, built for streaming on the edge ☆39 Updated 2 weeks ago
- ☆62 Updated 8 months ago
- Provide Gradio custom components to make the diarization-based audio labeling process easier and faster. ☆61 Updated 3 weeks ago
- Agentic RAG to help you build a startup🚀 ☆16 Updated 3 weeks ago
- StyleTTS 2 Optimized Training Fork ☆26 Updated 2 months ago
- Implementation of 'Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis', in MLX ☆16 Updated 5 months ago
- llmon-py is a multimodal webui for Llama 3-8B. ☆16 Updated 9 months ago
- Trying to build an all in one speech-text language model - a bit like GPT-4o ☆22 Updated 10 months ago
- Speaker diarization service ☆21 Updated last month
- This public GitHub repository contains code for a fully self-hosted, on-premise transcription solution. ☆53 Updated 3 months ago
- A streaming whisper server for on-prem transcription ☆20 Updated 7 months ago
- Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS (E2 TTS) in MLX ☆27 Updated 5 months ago
- Generate audio datasets for training Text-To-Speech models, through smart audio splitting with silence detection, and transcription using… ☆28 Updated last year
- Dippy Synthetic Speech Subnet ☆16 Updated last week
- Supervoice diffusion enhance ☆26 Updated 8 months ago
- Code associated with the paper: CTC-DRO: Robust Optimization for Reducing Language Disparities in Speech Recognition. ☆14 Updated 3 weeks ago
- Cog wrapper for collabora/WhisperSpeech ☆25 Updated last year
- Mission to create a Hebrew TTS model as powerful and user-friendly as WaveNet ☆33 Updated 2 months ago
- Speaker Diarization with Transformers ☆64 Updated 10 months ago
- Uses a Gradio interface to stream coding related responses from local and cloud based large language models. Pulls context from GitHub Re… ☆20 Updated 3 weeks ago
- GPT-4 Level Conversational QA Trained In a Few Hours ☆59 Updated 7 months ago
- The Benefits of a Concise Chain of Thought on Problem Solving in Large Language Models ☆21 Updated 4 months ago
- Experiments with BitNet inference on CPU ☆53 Updated last year
- Developer showcase of projects built on Cartesia ☆17 Updated 7 months ago
- KATube is a tool to automate the process of creating datasets for training Text-To-Speech (TTS) and Speech-To-Text (STT) models. From a l… ☆22 Updated 8 months ago
- (WIP) A retrain of F5-TTS on permissively-licensed data ☆9 Updated 2 weeks ago