facebookresearch/seamless_communication

Foundational Models for State-of-the-Art Speech and Text Translation

☆11,823

Alternatives and similar repositories for seamless_communication

Users who are interested in seamless_communication are comparing it to the libraries listed below.

facebookresearch / audiocraft
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor…
☆23,525 • Mar 3, 2026 • Updated 4 months ago

suno-ai / bark
🔊 Text-Prompted Generative Audio Model
☆39,216 • Aug 19, 2024 • Updated last year

coqui-ai / TTS
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
☆45,820 • Aug 16, 2024 • Updated last year

Plachtaa / VALL-E-X
An open source implementation of Microsoft's VALL-E X zero-shot TTS model. Demo is available at https://plachtaa.github.io/vallex/
☆7,936 • Feb 11, 2024 • Updated 2 years ago

myshell-ai / OpenVoice
Instant voice cloning by MIT and MyShell. Audio foundation model.
☆37,026 • Apr 19, 2025 • Updated last year

openai / whisper
Robust Speech Recognition via Large-Scale Weak Supervision
☆105,641 • Apr 15, 2026 • Updated 3 months ago

meta-llama / codellama
Inference code for CodeLlama models
☆16,286 • Aug 12, 2024 • Updated last year

open-mmlab / Amphion
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junio…
☆9,967 • Mar 25, 2026 • Updated 4 months ago

facebookresearch / fairseq
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
☆32,255 • Sep 30, 2025 • Updated 9 months ago

huggingface / distil-whisper
Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller, within 1% word error rate.
☆4,098 • Jan 8, 2025 • Updated last year

m-bain / whisperX
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
☆23,268 • Jul 13, 2026 • Updated 2 weeks ago

yl4579 / StyleTTS2
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models
☆6,318 • Aug 10, 2024 • Updated last year

haotian-liu / LLaVA
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
☆24,945 • Aug 12, 2024 • Updated last year

lm-sys / FastChat
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
☆39,502 • May 1, 2026 • Updated 2 months ago

ggml-org / whisper.cpp
Port of OpenAI's Whisper model in C/C++
☆52,304 • Jul 11, 2026 • Updated 2 weeks ago

huggingface / parler-tts
Inference and training library for high-quality TTS models.
☆5,581 • Dec 10, 2024 • Updated last year

SYSTRAN / faster-whisper
Faster Whisper transcription with CTranslate2
☆24,545 • Nov 19, 2025 • Updated 8 months ago

metavoiceio / metavoice-src
Foundational model for human-like, expressive TTS
☆4,203 • Jul 30, 2024 • Updated last year

netease-youdao / EmotiVoice
EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine
☆8,499 • Aug 13, 2024 • Updated last year

Stability-AI / generative-models
Generative Models by Stability AI
☆27,228 • Dec 16, 2025 • Updated 7 months ago

meta-llama / llama
Inference code for Llama models
☆59,527 • Jan 26, 2025 • Updated last year

fishaudio / fish-speech
SOTA Open Source TTS
☆31,382 • Updated this week

nlpxucan / WizardLM
LLMs built upon Evol Instruct: WizardLM, WizardCoder, WizardMath
☆9,482 • Jun 7, 2025 • Updated last year

microsoft / unilm
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
☆22,173 • Jan 23, 2026 • Updated 6 months ago

jasonppy / VoiceCraft
Zero-Shot Speech Editing and Text-to-Speech in the Wild
☆8,505 • May 30, 2026 • Updated last month

mlc-ai / mlc-llm
Universal LLM Deployment Engine with ML Compilation
☆23,002 • Updated this week

run-llama / llama_index
LlamaIndex is the leading document agent and OCR platform
☆51,117 • Updated this week

openinterpreter / openinterpreter
A coding agent for open models like Kimi K3
☆67,308 • Updated this week

NVIDIA-NeMo / Speech
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Auto…
☆17,822 • Updated this week

WhisperSpeech / WhisperSpeech
An Open Source text-to-speech system built by inverting Whisper.
☆4,624 • Dec 14, 2025 • Updated 7 months ago

facefusion / facefusion
Industry leading face manipulation platform
☆29,400 • Updated this week

microsoft / autogen
A programming framework for agentic AI
☆59,996 • Apr 15, 2026 • Updated 3 months ago

kyutai-labs / moshi
Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audi…
☆10,726 • May 16, 2026 • Updated 2 months ago

modelscope / facechain
FaceChain is a deep-learning toolchain for generating your Digital-Twin.
☆9,507 • Jun 6, 2025 • Updated last year

mit-han-lab / streaming-llm
[ICLR 2024] Efficient Streaming Language Models with Attention Sinks
☆7,249 • Jul 11, 2024 • Updated 2 years ago

facebookresearch / nougat
Implementation of Nougat Neural Optical Understanding for Academic Documents
☆10,052 • Feb 21, 2025 • Updated last year

speechbrain / speechbrain
A PyTorch-based Speech Toolkit
☆11,717 • Jun 15, 2026 • Updated last month

oobabooga / textgen
Open-source desktop app for local LLMs. Text, vision, tool-calling, OpenAI/Anthropic-compatible API. 100% private.
☆47,492 • Jun 2, 2026 • Updated last month

Vaibhavs10 / insanely-fast-whisper
☆12,997 • Oct 25, 2025 • Updated 9 months ago