kyutai-labs/hibiki

kyutai-labs / hibiki

Hibiki is a model for streaming speech translation (also known as simultaneous translation). Unlike offline translation, where one waits for the end of the source utterance to start translating, Hibiki adapts its flow to accumulate just enough context to produce a correct translation in real time, chunk by chunk.

☆1,486

Alternatives and similar repositories for hibiki

Users who are interested in hibiki are comparing it to the libraries listed below.

kyutai-labs / moshi
Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audi…
☆10,646 · May 16, 2026 · Updated 2 months ago
kyutai-labs / moshi-finetune
☆473 · Oct 3, 2025 · Updated 9 months ago
kyutai-labs / delayed-streams-modeling
Kyutai's Speech-To-Text and Text-To-Speech models based on the Delayed Streams Modeling framework.
☆2,982 · Jan 26, 2026 · Updated 5 months ago
kyutai-labs / moshivis
Kyutai with an "eye"
☆254 · Mar 26, 2025 · Updated last year
kyutai-labs / moshi-swift
☆140 · Jun 26, 2025 · Updated last year
edwko / OuteTTS
Interface for OuteTTS models.
☆1,435 · Mar 23, 2026 · Updated 3 months ago
canopyai / Orpheus-TTS
Towards Human-Sounding Speech
☆6,251 · Dec 5, 2025 · Updated 7 months ago
kyutai-labs / unmute
Make text LLMs listen and speak
☆1,366 · Updated this week
kyutai-labs / moshi-webrtc
Proof of concept for running moshi/hibiki using WebRTC
☆21 · Feb 28, 2025 · Updated last year
fixie-ai / ultravox
A fast multimodal LLM for real-time voice
☆4,476 · Dec 12, 2025 · Updated 7 months ago
Zyphra / Zonos
Zonos-v0.1 is a leading open-weight text-to-speech model trained on more than 200k hours of varied multilingual speech, delivering expres…
☆7,229 · Mar 5, 2025 · Updated last year
kyutai-labs / hibiki-zero
A real-time and multilingual speech translation model
☆262 · Feb 13, 2026 · Updated 5 months ago
SesameAILabs / csm
A Conversational Speech Generation Model
☆14,701 · May 27, 2025 · Updated last year
facebookresearch / audiobox-aesthetics
Unified automatic quality assessment for speech, music, and sound.
☆744 · Jun 5, 2025 · Updated last year
ictnlp / StreamSpeech
StreamSpeech is an "All in One" seamless model for offline and simultaneous speech recognition, speech translation and speech synthesis.
☆1,278 · Jun 29, 2025 · Updated last year
pipecat-ai / smart-turn
☆1,478 · Jan 29, 2026 · Updated 5 months ago
yangdongchao / RSTnet
Real-time Speech-Text Foundation Model Toolkit (wip)
☆256 · Mar 26, 2025 · Updated last year
facebookresearch / omnilingual-asr
Omnilingual ASR: Open-Source Multilingual Speech Recognition for 1600+ Languages
☆2,854 · Dec 30, 2025 · Updated 6 months ago
facebookresearch / FlowDec
A neural full-band audio codec for general audio sampled at 48 kHz with 7.5 kbps or 4.5 kbps.
☆212 · Jun 22, 2026 · Updated 3 weeks ago
kyutai-labs / sphn
Python bindings for symphonia/opus: read various audio formats from Python and write opus files
☆80 · Jan 7, 2026 · Updated 6 months ago
Camb-ai / MARS5-TTS
MARS5 speech model (TTS) from CAMB.AI
☆2,816 · Aug 1, 2024 · Updated last year
huggingface / parler-tts
Inference and training library for high-quality TTS models.
☆5,582 · Dec 10, 2024 · Updated last year
nari-labs / dia
A TTS model capable of generating ultra-realistic dialogue in one pass.
☆19,341 · Nov 19, 2025 · Updated 8 months ago
facebookresearch / spiritlm
Inference code for the paper "Spirit-LM Interleaved Spoken and Written Language Model".
☆928 · Oct 28, 2024 · Updated last year
janhq / ichigo
Local realtime voice AI
☆2,490 · Nov 26, 2025 · Updated 7 months ago
Standard-Intelligence / hertz-dev
First base model for full-duplex conversational audio
☆1,794 · Jan 5, 2025 · Updated last year
NVIDIA / audio-flamingo
PyTorch implementation of Audio Flamingo: Series of Advanced Audio Understanding Language Models
☆1,157 · Dec 15, 2025 · Updated 7 months ago
huggingface / speech-to-speech
Build local voice agents with open-source models
☆6,242 · Updated this week
multimodal-art-projection / YuE
YuE: Open Full-song Music Generation Foundation Model, something similar to Suno.ai but open
☆6,336 · Jun 4, 2025 · Updated last year
MYZY-AI / Muyan-TTS
☆480 · May 19, 2025 · Updated last year
yl4579 / StyleTTS2
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models
☆6,314 · Aug 10, 2024 · Updated last year
SWivid / F5-TTS
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
☆14,986 · Jul 5, 2026 · Updated 2 weeks ago
fluxions-ai / vui
Real-time voice assistant: WebRTC streaming, faster-whisper ASR, local LLM, Vui Nano (300M) TTS. OpenAI Realtime API compatible. Voice c…
☆727 · Jul 9, 2026 · Updated last week
MatthewCYM / VoiceBench
[TACL'26] VoiceBench: Benchmarking LLM-Based Voice Assistants
☆378 · Jun 11, 2026 · Updated last month
FunAudioLLM / FunMusic
A fundamental toolkit designed for music, song, and audio generation
☆1,369 · May 20, 2025 · Updated last year
gradio-app / fastrtc
The Python library for real-time communication
☆4,616 · Jan 12, 2026 · Updated 6 months ago
facebookresearch / seamless_communication
Foundational Models for State-of-the-Art Speech and Text Translation
☆11,816 · Apr 8, 2026 · Updated 3 months ago
huggingface / distil-whisper
Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller, within 1% word error rate.
☆4,091 · Jan 8, 2025 · Updated last year
Stability-AI / stable-codec
A family of state-of-the-art Transformer-based audio codecs for low-bitrate high-quality audio coding.
☆437 · Updated this week