homebrewltd / ichigo

Llama3.1 learns to Listen

☆1,749

Related projects

Alternatives and complementary repositories for ichigo

Standard-Intelligence / hertz-dev
first base model for full-duplex conversational audio
☆1,362 Updated this week
ictnlp / LLaMA-Omni
LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve spee…
☆2,538 Updated last month
aiola-lab / whisper-medusa
Whisper with Medusa heads
☆800 Updated last week
usefulsensors / moonshine
Fast and accurate automatic speech recognition (ASR) for edge devices
☆2,107 Updated last week
mezbaul-h / june
Local voice chatbot for engaging conversations, powered by Ollama, Hugging Face Transformers, and Coqui TTS Toolkit
☆714 Updated 3 months ago
fixie-ai / ultravox
A fast multimodal LLM for real-time voice
☆980 Updated this week
pipecat-ai / pipecat
Open Source framework for voice and multimodal conversational AI
☆3,346 Updated this week
gabrielchua / open-notebooklm
Convert any PDF into a podcast episode!
☆1,438 Updated last week
facebookresearch / spiritlm
Inference code for the paper "Spirit-LM Interleaved Spoken and Written Language Model".
☆755 Updated 2 weeks ago
collabora / WhisperFusion
WhisperFusion builds upon the capabilities of WhisperLive and WhisperSpeech to provide seamless conversations with an AI.
☆1,544 Updated 3 months ago
lifeiteng / OmniSenseVoice
Omni SenseVoice: High-Speed Speech Recognition with word timestamps 🗣️🎯
☆699 Updated this week
lhl / voicechat2
Local SRT/LLM/TTS Voicechat
☆535 Updated last month
PromtEngineer / Verbi
A modular voice assistant application for experimenting with state-of-the-art transcription, response generation, and text-to-speech mode…
☆783 Updated last week
huggingface / speech-to-speech
Speech To Speech: an effort for an open-sourced and modular GPT4-o
☆3,499 Updated last week
livekit / agents
Build real-time multimodal AI applications 🤖🎙️📹
☆3,929 Updated this week
dleemiller / WordLlama
Things you can do with the token embeddings of an LLM
☆1,311 Updated this week
lamm-mit / PDF2Audio
☆1,065 Updated last month
fedirz / faster-whisper-server
☆717 Updated this week
CerebriumAI / examples
☆443 Updated this week
lumina-ai-inc / chunkr
Vision model based document ingestion
☆1,226 Updated this week
alexpinel / Dot
Text-To-Speech, RAG, and LLMs. All local!
☆1,606 Updated 4 months ago
huggingface / parler-tts
Inference and training library for high-quality TTS models.
☆4,592 Updated last week
kyutai-labs / moshi
☆6,692 Updated last week
collabora / WhisperSpeech
An Open Source text-to-speech system built by inverting Whisper.
☆3,956 Updated 4 months ago
souzatharsis / podcastfy
An Open Source Python alternative to NotebookLM's podcast feature: Transforming Multimodal Content into Captivating Multilingual Audio Co…
☆990 Updated this week
DigitalPhonetics / IMS-Toucan
Controllable and fast Text-to-Speech for over 7000 languages!
☆1,448 Updated this week
nrl-ai / llama-assistant
AI-powered assistant to help you with your daily tasks, powered by Llama 3.2. It can recognize your voice, process natural language, and …
☆408 Updated last month
bklieger-groq / g1
g1: Using Llama-3.1 70b on Groq to create o1-like reasoning chains
☆3,864 Updated last month
THUDM / LongWriter
LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs
☆1,462 Updated 2 weeks ago
lm-sys / RouteLLM
A framework for serving and evaluating LLM routers - save LLM costs without compromising quality!
☆3,220 Updated 3 months ago