kyutai-labs/tts_longeval

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/kyutai-labs/tts_longeval)

kyutai-labs / tts_longeval

☆30

Alternatives and similar repositories for tts_longeval

Users that are interested in tts_longeval are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

fyvo / WMT-Biomed-Test
View on GitHub
☆13Aug 23, 2024Updated last year
MuyangDu / T5Voice
View on GitHub
T5Voice is a lightweight PyTorch implementation of T5-based text-to-speech synthesis, supporting both streaming and non-streaming speech …
☆28Nov 7, 2025Updated 8 months ago
choiHkk / Transformer-TTS-V2
View on GitHub
☆25Mar 6, 2024Updated 2 years ago
qiuqiao / DDSP-HiFiGAN
View on GitHub
基于PC-DDSP和nsf-HiFiGAN的声码器
☆19Jul 17, 2023Updated 3 years ago
NUS-HPC-AI-Lab / MoST
View on GitHub
MoST: Mixing Speech and Text with Modality-Aware Mixture of Experts
☆33Jan 15, 2026Updated 6 months ago
Managed Kubernetes at scale on DigitalOcean • Ad
DigitalOcean Kubernetes includes the control plane, bandwidth allowance, container registry, automatic updates, and more for free.
ozspeech / OZSpeech
View on GitHub
[ACL 2025] OZSpeech: One-step Zero-shot Speech Synthesis with Learned-Prior-Conditioned Flow Matching
☆45Feb 9, 2025Updated last year
yongaifadian1 / MNV-17
View on GitHub
Qwen2.5-Omni fine-tuned on MNV-17 dataset for nonverbal vocalization recognition
☆31Nov 13, 2025Updated 8 months ago
Ruiqi-Yan / Awesome-Audio-Editing
View on GitHub
A curated list of models, benchmarks, tools and guides for audio editing
☆35Jul 7, 2026Updated 3 weeks ago
IDEA-Emdoor-Lab / DistilCodec
View on GitHub
A Neural Audio Codec (NAC) for Universal Audio
☆47May 30, 2025Updated last year
kyutai-labs / nanoGPTaudio
View on GitHub
Code for the blog "Neural audio codecs: how to get audio into LLMs"
☆174Oct 20, 2025Updated 9 months ago
rwth-i6 / i6_core
View on GitHub
Sisyphus recipies for ASR
☆19Jul 17, 2026Updated last week
AmphionTeam / FlexiCodec
View on GitHub
[ICLR2026] FlexiCodec: A Dynamic Neural Audio Codec for Low Frame Rates
☆51Jul 1, 2026Updated 3 weeks ago
jhuang448 / MultilingualALT
View on GitHub
Repo of the paper "Towards Building an End-to-End Multilingual Automatic Lyrics Transcription Model""
☆15Jun 28, 2024Updated 2 years ago
KdaiP / DC-Speech-VAE
View on GitHub
5Hz Deep-Compression Speech VAE for AR-Diffusion and CALMs
☆57Nov 19, 2025Updated 8 months ago
Deploy on Railway without the complexity - Free Credits Offer • Ad
Connect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
talhanai / kaldi-diar-latte
View on GitHub
steps to perform text-based speaker diarization with kaldi toolkit
☆12Nov 2, 2018Updated 7 years ago
IDEA-Emdoor-Lab / UniTTS
View on GitHub
A TTS Trained on Universal Audio.
☆41Jun 6, 2025Updated last year
hlt-mt / Speech-MASSIVE
View on GitHub
Speech-MASSIVE is a multilingual Spoken Language Understanding (SLU) dataset comprising the speech counterpart for a portion of the MASSI…
☆25Oct 8, 2025Updated 9 months ago
lucky-bai / wasm-speech-streaming
View on GitHub
Offline streaming speech-to-text in the browser
☆27Aug 28, 2025Updated 11 months ago
0nutation / SLMTokBench
View on GitHub
SLMTokBench for paper "SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models"
☆37Aug 29, 2023Updated 2 years ago
Picovoice / text-to-speech-benchmark
View on GitHub
Text-to-Speech Benchmark
☆26Apr 2, 2026Updated 3 months ago
eugenehp / bitnet-cpp-rs
View on GitHub
Rust bindings for bitnet.cpp based on llama-cpp-4
☆17Dec 28, 2025Updated 7 months ago
Adibian / Persian-MultiSpeaker-Tacotron2
View on GitHub
Implementation of Transfer Learning from Speaker Verification to Multi-speaker Text-To-Speech Synthesis (SV2TTS) in Persian language.
☆13Oct 2, 2025Updated 9 months ago
pkufool / simple-wer
View on GitHub
A simple command line tool to calculate WER for ASR.
☆14Updated this week
Open source password manager - Proton Pass • Ad
Securely store, share, and autofill your credentials with Proton Pass, the end-to-end encrypted password manager trusted by millions.
Soul-AILab / SAC
View on GitHub
[ACL 2026 Main] Training, inference, and testing of the SAC speech codec model.
☆108Nov 1, 2025Updated 8 months ago
JaesungHuh / ca-subtitle
View on GitHub
Implementation of "Look, Listen and Recognise:character-aware audio-visual subtitling"
☆21Nov 3, 2025Updated 8 months ago
lourson1091 / audiobertscore
View on GitHub
☆15Nov 10, 2025Updated 8 months ago
HeCheng0625 / Diffusion-Speech-Tokenizer
View on GitHub
This repository contains a series of works on diffusion-based speech tokenizers, including the official implementation of the paper: "TaD…
☆198Jan 25, 2026Updated 6 months ago
yfyeung / DS-WED
View on GitHub
[ICASSP 2026] Official code for "Measuring Prosody Diversity in Zero-Shot TTS: A New Metric, Benchmark, and Exploration"
☆17Apr 16, 2026Updated 3 months ago
kyutai-labs / hibiki-zero
View on GitHub
A real-time and multilingual speech translation model
☆264Feb 13, 2026Updated 5 months ago
Mu-Y / DiariST
View on GitHub
☆18Sep 19, 2023Updated 2 years ago
pengzhendong / ngram-punctuator
View on GitHub
An N-gram punctuator for Chinese and English.
☆18Oct 14, 2025Updated 9 months ago
mt-upc / ZeroSwot
View on GitHub
Pushing the Limits of Zero-shot End-to-End Speech Translation
☆25Dec 12, 2024Updated last year
Managed hosting for WordPress and PHP on Cloudways • Ad
Managed hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
sarulab-speech / Sidon
View on GitHub
Training code and dataset cleasing with Sidon
☆169Apr 24, 2026Updated 3 months ago
sii-research / OpenMOSS
View on GitHub
OpenMOSS presents a collection of our research on LLMs, supported by SII, Fudan and Mosi.
☆30Updated this week
zhu-han / SpeechLLM
View on GitHub
LLM-based ASR recipe with Zipformer encoder and Qwen LLM
☆35Sep 25, 2025Updated 10 months ago
bshall / dusted
View on GitHub
DUSTED: Spoken-Term Discovery using Discrete Speech Units
☆17Oct 2, 2024Updated last year
apptek / SubER
View on GitHub
SubER - Subtitle Edit Rate
☆26May 7, 2026Updated 2 months ago
gau-nernst / kokoro
View on GitHub
https://hf.co/hexgrad/Kokoro-82M
☆14Jan 14, 2026Updated 6 months ago
XiaomiMiMo / MiMo-Audio-Training
View on GitHub
☆109Oct 16, 2025Updated 9 months ago