facebookresearch / seamless_communication
Foundational Models for State-of-the-Art Speech and Text Translation
☆11,529Updated 6 months ago
Alternatives and similar repositories for seamless_communication
Users who are interested in seamless_communication are comparing it to the libraries listed below.
- Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller, within 1% word error rate.☆3,856Updated 4 months ago
- ☆8,385Updated 11 months ago
- WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)☆15,835Updated 2 weeks ago
- Faster Whisper transcription with CTranslate2☆16,090Updated 3 weeks ago
- Official inference library for Mistral models☆10,228Updated 2 months ago
- EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine☆7,986Updated 9 months ago
- Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor…☆21,987Updated 2 months ago
- Inference code for CodeLlama models☆16,296Updated 9 months ago
- An Open Source text-to-speech system built by inverting Whisper.☆4,249Updated last month
- Large Language Model Text Generation Inference☆10,145Updated this week
- Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junio…☆9,059Updated last month
- Universal LLM Deployment Engine with ML Compilation☆20,649Updated 3 weeks ago
- Python bindings for llama.cpp☆9,107Updated 2 weeks ago
- Inference and training library for high-quality TTS models.☆5,257Updated 5 months ago
- Text-to-Audio/Music Generation☆2,420Updated 7 months ago
- An open source implementation of Microsoft's VALL-E X zero-shot TTS model. A demo is available at https://plachtaa.github.io/vallex/☆7,868Updated last year
- LLMs built upon Evol-Instruct: WizardLM, WizardCoder, WizardMath☆9,403Updated 9 months ago
- Inference code for Llama models☆58,242Updated 3 months ago
- Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work!☆38,130Updated this week
- StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models☆5,741Updated 9 months ago
- Tensor library for machine learning☆12,567Updated this week
- 🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production☆40,062Updated 9 months ago
- [NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.☆22,540Updated 9 months ago
- Welcome to the Llama Cookbook! This is your go-to guide for Building with Llama: Getting started with Inference, Fine-Tuning, RAG. We als…☆17,328Updated last week
- tiktoken is a fast BPE tokeniser for use with OpenAI's models.☆14,523Updated 2 months ago
- An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.☆38,593Updated last month
- Inference Llama 2 in one file of pure C☆18,399Updated 9 months ago
- 🔊 Text-Prompted Generative Audio Model☆37,838Updated 9 months ago
- QLoRA: Efficient Finetuning of Quantized LLMs☆10,439Updated 11 months ago
- Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audi…☆8,231Updated this week