tincans-ai / gazelle-inference
Proof-of-concept conversation orchestrator with a speech-language model
☆20 Updated 8 months ago
Alternatives and similar repositories for gazelle-inference
Users who are interested in gazelle-inference are comparing it to the libraries listed below.
- Joint speech-language model - respond directly to audio! ☆30 Updated last year
- a simple system for 2-way interruptible voice interactions between human and LLM ☆29 Updated last year
- Open TTS models, built for streaming on the edge ☆43 Updated 3 months ago
- Supervoice diffusion enhance ☆27 Updated 11 months ago
- Audio tokenization, in the fastest way possible! ☆52 Updated 9 months ago
- Provide Gradio custom components to make the diarization-based audio labeling process easier and faster. ☆62 Updated 3 weeks ago
- VoiceBox neural network implementation ☆109 Updated 10 months ago
- [Early Alpha] A unified framework for text-to-speech, voice conversion, automatic speech recognition, audio classification, voice activit… ☆21 Updated 5 months ago
- 🎙️ Automatically transcribe audio/video into high-quality, speaker-specific Text-To-Speech datasets ✨ ☆90 Updated last month
- Speaker Diarization with Transformers ☆68 Updated 2 weeks ago
- A list of podcast URLs scraped from the Apple podcast database in late 2021, including a script for downloading those podcasts. ☆41 Updated 3 years ago
- Code associated with the paper: CTC-DRO: Robust Optimization for Reducing Language Disparities in Speech Recognition. ☆15 Updated last month
- Text-to-Speech Latency Benchmark ☆14 Updated this week
- Generate audio datasets for training Text-To-Speech models, through smart audio splitting with silence detection, and transcription using… ☆28 Updated 2 years ago
- Google's SoundStorm: Efficient Parallel Audio Generation ☆132 Updated last year
- StyleTTS 2 Optimized Training Fork ☆31 Updated 4 months ago
- Trying to build an all in one speech-text language model - a bit like GPT-4o ☆22 Updated last year
- A synthetic story narration dataset to study small audio LMs. ☆32 Updated last year
- ☆62 Updated 11 months ago
- ☆40 Updated 4 months ago
- This repository contains all the code necessary for running the multilingual distilwhisper from Ferraz et al. 2024 IEEE ICASSP paper. ☆24 Updated last year
- Tunable pipelines ☆34 Updated 4 months ago
- Unofficial implementation of wavenext vocoder ☆47 Updated 9 months ago
- ☆15 Updated 3 months ago
- ☆15 Updated 3 months ago
- Conformer block with Rotary Position Embedding, modified from lucidrains' implementation ☆14 Updated 9 months ago
- Speaker diarization service ☆23 Updated 2 months ago
- A lightweight Python library for running TTS models with a unified API. ☆18 Updated 4 months ago
- An espeak-compatible, permissively-licensed IPA phonemizer (G2P) based on DeepPhonemizer. Usable as a drop-in replacement for espeak's GP… ☆99 Updated 8 months ago
- Implementation of Sesame's Conversational Speech Model for Hugging Face Transformers ☆56 Updated last month