tarzain / crosstalk
A simple system for two-way, interruptible voice interaction between a human and an LLM
☆17 · Updated 8 months ago
Related projects
Alternatives and complementary repositories for crosstalk
- Supervoice diffusion enhance ☆25 · Updated 3 months ago
- Joint speech-language model - respond directly to audio! ☆30 · Updated 6 months ago
- Implementation of 'Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis', in MLX ☆13 · Updated 2 weeks ago
- Audio tokenization, in the fastest way possible! ☆45 · Updated 2 months ago
- Generate audio datasets for training Text-To-Speech models, through smart audio splitting with silence detection, and transcription using… ☆28 · Updated last year
- ☆61 · Updated 3 months ago
- Implementation of "Audio xLSTMs: Learning Self-supervised audio representations with xLSTMs" in PyTorch ☆16 · Updated this week
- Provide Gradio custom components to make the diarization-based audio labeling process easier and faster. ☆45 · Updated last week
- Collection of scripts from mHuBERT-147. ☆22 · Updated 4 months ago
- ☆9 · Updated last month
- The YouTube Text-To-Speech dataset is comprised of waveform audio extracted from YouTube videos alongside their English transcriptions ☆50 · Updated 3 years ago
- Legible, Scalable, Reproducible Foundation Models with Named Tensors and Jax ☆11 · Updated 4 months ago
- This is a fork of the original fairseq repository (version 0.12.2) with added classes for training mHuBERT-147. ☆12 · Updated 5 months ago
- Babylon.cpp is a C and C++ library for grapheme to phoneme conversion and text to speech synthesis. For phonemization a ONNX runtime port… ☆12 · Updated 2 months ago
- GPT for FACodec ☆13 · Updated 7 months ago
- KATube is a tool to automate the process of creating datasets for training Text-To-Speech (TTS) and Speech-To-Text (STT) models. From a l…