thevoicecompany / gazelle-train
Joint speech-language model - respond directly to audio!
☆30 May 13, 2024 Updated last year
Alternatives and similar repositories for gazelle-train
Users who are interested in gazelle-train are comparing it to the libraries listed below.
- ☆16 Oct 6, 2024 Updated last year
- Joint speech-language model - respond directly to audio! ☆372 Jul 1, 2024 Updated last year
- Whisper Speech Quality Assessment (WhiSQA) ☆16 Oct 14, 2025 Updated 4 months ago
- Python scripts to create noisy and reverberant 2-speaker mixture audio with Libri-Light and WHAM ☆17 Nov 7, 2024 Updated last year
- Voice activity detection and speaker gender segmentation audiovisual corpus ☆16 Jan 20, 2025 Updated last year
- Dart plugin wrapping the Sherpa-ONNX runtime. Contains example for speech recognition with Flutter ☆22 Jan 3, 2025 Updated last year
- Trying to build an all in one speech-text language model - a bit like GPT-4o ☆22 Jun 1, 2024 Updated last year
- ☆11 Aug 11, 2023 Updated 2 years ago
- This repository contains all the code necessary for running the multilingual distilwhisper from Ferraz et al. 2024 IEEE ICASSP paper. ☆33 Oct 23, 2025 Updated 3 months ago
- Implementation for paper "Disentangled Speech Representation Learning for One-Shot Cross-Lingual Voice Conversion Using ß-VAE" ☆44 Apr 10, 2023 Updated 2 years ago
- Project for HIDING SPEAKER’S SEX IN SPEECH USING ZERO-EVIDENCE SPEAKER REPRESENTATION IN AN ANALYSIS/SYNTHESIS PIPELINE ☆15 Nov 30, 2022 Updated 3 years ago
- ☆11 Nov 7, 2024 Updated last year
- This is not remotely close to a finished product, and does not intend to nor does this claim to be working fine-tuning code for MaskGCT. … ☆13 Dec 4, 2024 Updated last year
- [INTERSPEECH 2024] Official code for VoxSim: A perceptual voice similarity dataset ☆12 Sep 29, 2025 Updated 4 months ago
- A Benchmark Corpus for Low-Resource Cantonese Punctuation Restoration from Speech Transcripts ☆16 Dec 3, 2024 Updated last year
- ☆14 Aug 1, 2025 Updated 6 months ago
- An evaluation set for large-scale trained TTS models (Coming in Sep 2024) ☆12 Sep 2, 2024 Updated last year
- ☆11 May 7, 2022 Updated 3 years ago
- Source code for ASRU 2019 paper "Adapting Pretrained Transformer to Lattices for Spoken Language Understanding" ☆10 Jul 8, 2020 Updated 5 years ago
- ☆11 Nov 28, 2025 Updated 2 months ago
- Project of Singing Voice Conversion. ☆16 Oct 27, 2023 Updated 2 years ago
- DinoSR: Self-Distillation and Online Clustering for Self-supervised Speech Representation Learning ☆53 Jan 18, 2024 Updated 2 years ago
- ☆28 Oct 7, 2025 Updated 4 months ago
- [Interspeech 2024] LiteFocus is a tool designed to accelerate diffusion-based TTA model, now implemented with the base model AudioLDM2. ☆34 Mar 11, 2025 Updated 11 months ago
- Interface Design for Self-Supervised Speech Models, Accepted to Interspeech2024 ☆16 Nov 19, 2024 Updated last year
- DUSTED: Spoken-Term Discovery using Discrete Speech Units ☆18 Oct 2, 2024 Updated last year
- ☆11 Oct 14, 2023 Updated 2 years ago
- ☆13 Oct 27, 2021 Updated 4 years ago
- Official repository for NAST: Noise Aware Speech Tokenization for Speech Language Models (Interspeech 2024) https://arxiv.org/abs/2406.11… ☆46 Jul 2, 2024 Updated last year
- A robust pitch tracker using synchro-squeezed fft and frequency domain autocorrelation ☆36 Jan 17, 2024 Updated 2 years ago
- ☆14 Aug 19, 2024 Updated last year
- The Official PyTorch Implementation of "Mel-McNet: A Mel-Scale Framework for Online Multichannel Speech Enhancement" [Interspeech 2025] ☆21 Jun 9, 2025 Updated 8 months ago
- Clean and modernized implementation of FastSpeech2/LightSpeech using IPA ☆18 Aug 16, 2024 Updated last year
- SpeechNAS-Better-Trade-off-between-Latency-and-Accuracy-for-Large-Scale-Speaker-Verification ☆30 Mar 24, 2023 Updated 2 years ago
- SLMTokBench for paper "SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models" ☆37 Aug 29, 2023 Updated 2 years ago
- ☆36 Jan 6, 2026 Updated last month
- Conformer block with Rotary Position Embedding, modified from lucidrains' implement ☆16 Sep 13, 2024 Updated last year
- ☆14 Feb 9, 2023 Updated 3 years ago
- ESLTTS dataset ☆16 Feb 6, 2025 Updated last year