tarzain / crosstalk
a simple system for 2-way interruptible voice interactions between human and LLM
☆17 · Updated 9 months ago
Related projects
Alternatives and complementary repositories for crosstalk
- Provide Gradio custom components to make the diarization-based audio labeling process easier and faster. ☆45 · Updated 2 weeks ago
- Supervoice diffusion enhance ☆24 · Updated 4 months ago
- A multilingual phoneme recognizer capable of generalizing zero-shot to unseen phoneme inventories. ☆16 · Updated 3 weeks ago
- Implementation of 'Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis', in MLX ☆13 · Updated 3 weeks ago
- Generate audio datasets for training Text-To-Speech models, through smart audio splitting with silence detection, and transcription using… ☆28 · Updated last year
- Implementation of "Audio xLSTMs: Learning Self-supervised audio representations with xLSTMs" in PyTorch ☆16 · Updated last week
- Audio tokenization, in the fastest way possible! ☆45 · Updated 2 months ago
- ☆9 · Updated last month
- ☆61 · Updated 3 months ago
- proof of concept conversation orchestrator with a speech-language model ☆14 · Updated last month
- Legible, Scalable, Reproducible Foundation Models with Named Tensors and Jax ☆11 · Updated 5 months ago
- GPT for FACodec ☆13 · Updated 7 months ago
- Collection of scripts from mHuBERT-147. ☆22 · Updated this week
- This public GitHub repository contains code for a fully self-hosted, on-premise transcription solution. ☆40 · Updated 3 weeks ago
- ☆11 · Updated last year
- My vocoder experiments ☆21 · Updated last month
- Joint speech-language model - respond directly to audio! ☆30 · Updated 6 months ago
- Babylon.cpp is a C and C++ library for grapheme to phoneme conversion and text to speech synthesis. For phonemization an ONNX runtime port… ☆12 · Updated 2 months ago
- This is a fork of the original fairseq repository (version 0.12.2) with added classes for training mHuBERT-147. ☆12 · Updated this week
- Supervoice Speaker Separation Network ☆13 · Updated 5 months ago
- Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS (E2 TTS) in MLX ☆23 · Updated last month
- Official repository of the IEEE SLT 2024 paper "Self-Supervised Syllable Discovery Based on Speaker-Disentangled HuBERT" ☆29 · Updated last month
- zero-shot realtime TTS system, fully offline, free and open source ☆14 · Updated last week
- Uses deepgram/whisper/custom models to create an LJSpeech dataset for voice model fine tuning ☆12 · Updated 2 weeks ago
- VoiceRestore: Flow-Matching Transformers for Universal Speech Restoration ☆84 · Updated last month
- Speaker diarization service ☆19 · Updated this week
- Efficient approach to speaker diarization using voice characteristics extraction ☆68 · Updated 6 months ago
- Speech enhancement in noisy and reverberant environments using deep neural networks ☆15 · Updated last month
- VoiceBox neural network implementation ☆96 · Updated 3 months ago
- Hifi-like Vocoder implemented in PyTorch ☆13 · Updated 2 years ago