thevoicecompany/gazelle-train

Joint speech-language model - respond directly to audio!

☆30

Alternatives and similar repositories for gazelle-train

Users who are interested in gazelle-train are comparing it to the repositories listed below.

thevoicecompany / bench.audio
☆16 · Oct 6, 2024 · Updated last year
tincans-ai / gazelle
Joint speech-language model - respond directly to audio!
☆374 · Jul 1, 2024 · Updated 2 years ago
leto19 / WhiSQA
Whisper Speech Quality Assessment (WhiSQA)
☆16 · Apr 14, 2026 · Updated 3 months ago
ina-foss / InaGVAD
Voice activity detection and speaker gender segmentation audiovisual corpus
☆16 · Jan 20, 2025 · Updated last year
jamesparsloe / llm.speech
Trying to build an all in one speech-text language model - a bit like GPT-4o
☆22 · Jun 1, 2024 · Updated 2 years ago
WangHelin1997 / LibriLightMix-WHAMR
Python scripts to create noisy and reverberant 2-speaker mixture audio with Libri-Light and WHAM
☆17 · Nov 7, 2024 · Updated last year
ErikEkstedt / conv_ssl
☆14 · Feb 9, 2023 · Updated 3 years ago
asuni / PitchSqueezer
A robust pitch tracker using synchro-squeezed fft and frequency domain autocorrelation
☆38 · Jan 17, 2024 · Updated 2 years ago
Yuanshi9815 / LiteFocus
[Interspeech 2024] LiteFocus is a tool designed to accelerate diffusion-based TTA model, now implemented with the base model AudioLDM2.
☆34 · Mar 11, 2025 · Updated last year
nmfisher / sherpa_onnx_dart
Dart plugin wrapping the Sherpa-ONNX runtime. Contains example for speech recognition with Flutter
☆22 · Jan 3, 2025 · Updated last year
MiuLab / Lattice-Transformer-SLU
Source code for ASRU 2019 paper "Adapting Pretrained Transformer to Lattices for Spoken Language Understanding"
☆10 · Jul 8, 2020 · Updated 6 years ago
ftshijt / speech_evaluation
A toolkit dedicated to speech evaluation.
☆23 · Sep 26, 2024 · Updated last year
qiuk2 / AAR
[Official Implementation] Acoustic Autoregressive Modeling 🔥
☆75 · Aug 24, 2024 · Updated last year
light1726 / BetaVAE_VC
Implementation for paper "Disentangled Speech Representation Learning for One-Shot Cross-Lingual Voice Conversion Using β-VAE"
☆43 · Apr 10, 2023 · Updated 3 years ago
csukuangfj / icefall
☆11 · Updated this week
ErikEkstedt / TurnGPT
TurnGPT: a Transformer-based Language Model for Predicting Turn-taking in Spoken Dialog
☆71 · May 18, 2024 · Updated 2 years ago
lcn-kul / xls-r-analysis-sqa
Analysis of XLS-R for Speech Quality Assessment
☆15 · Feb 10, 2025 · Updated last year
NTIA / WEnets
Reference Implementations of Waveform Evaluation Networks (WEnets)
☆27 · Sep 18, 2023 · Updated 2 years ago
tim-gromeyer / VoiceAssistant
Empower Your Voice, Secure Your Privacy - Experience VoiceAssistant, Your Customizable Offline Voice Assistant!
☆17 · Jul 15, 2025 · Updated last year
tincans-ai / gazelle-inference
proof of concept conversation orchestrator with a speech-language model
☆20 · Oct 19, 2024 · Updated last year
Enescigdem / SignLanguageRecognizer
☆16 · Nov 8, 2020 · Updated 5 years ago
cyhuang-tw / robust-vc
☆11 · May 7, 2022 · Updated 4 years ago
uthree / ddsp-vocoder
☆12 · Nov 7, 2024 · Updated last year
zocomputer / substrate-typescript
Substrate TypeScript SDK
☆10 · Sep 20, 2024 · Updated last year
cpii-cai / PunCantonese
A Benchmark Corpus for Low-Resource Cantonese Punctuation Restoration from Speech Transcripts
☆15 · Dec 3, 2024 · Updated last year
yuhangear / wenet-android
☆13 · Oct 27, 2021 · Updated 4 years ago
csukuangfj / kaldi-hmm-gmm
☆28 · Apr 24, 2026 · Updated 2 months ago
frankyoujian / Edge-Punct-Casing
☆33 · Feb 4, 2025 · Updated last year
atosystem / SSL_Interface
Interface Design for Self-Supervised Speech Models, Accepted to Interspeech2024
☆16 · Nov 19, 2024 · Updated last year
ZZDoog / Speaker2Dubber
[ACM MM24] Official implementation of paper "From Speaker to Dubber: Movie Dubbing with Prosody and Duration Consistency Learning"
☆34 · Updated this week
kyegomez / Audio-xLSTMs
Implementation of "Audio xLSTMs: Learning Self-supervised audio representations with xLSTMs" in PyTorch
☆19 · Jul 13, 2026 · Updated last week
wentaozhu / speechnas
SpeechNAS-Better-Trade-off-between-Latency-and-Accuracy-for-Large-Scale-Speaker-Verification
☆30 · Mar 24, 2023 · Updated 3 years ago
sentient-engineering / multi-agent-fsm
A simple way to write multi-agent systems
☆37 · Sep 5, 2024 · Updated last year
arnabdas8901 / StarGAN-VC_PlusPlus
☆11 · Aug 11, 2023 · Updated 2 years ago
yangdongchao / SimpleSpeech
The open source code for SimpleSpeech series
☆147 · Oct 8, 2024 · Updated last year
mfischer-ucl / metappearance
Metappearance: Meta-Learning for Visual Appearance Reproduction
☆22 · Sep 19, 2022 · Updated 3 years ago
ShovalMessica / NAST
Official repository for NAST: Noise Aware Speech Tokenization for Speech Language Models (Interspeech 2024) https://arxiv.org/abs/2406.11…
☆46 · Jul 2, 2024 · Updated 2 years ago
CMsmartvoice / Unet-TTS
One-shot TTS with Improved Unseen Speaker and Style Transfer
☆37 · Mar 2, 2022 · Updated 4 years ago
Alexander-H-Liu / dinosr
DinoSR: Self-Distillation and Online Clustering for Self-supervised Speech Representation Learning
☆53 · Jan 18, 2024 · Updated 2 years ago