LAION-AI / natural_voice_assistant
☆453Updated 3 months ago
Related projects:
- Joint speech-language model - respond directly to audio!☆312Updated 2 months ago
- Whisper with Medusa heads☆774Updated last week
- Command Your World with Voice☆368Updated 3 weeks ago
- Local AI talk with a custom voice based on Zephyr 7B model. Uses RealtimeSTT with faster_whisper for transcription and RealtimeTTS with C…☆475Updated last month
- ☆1,079Updated 2 months ago
- A fast multimodal LLM for real-time voice☆847Updated this week
- WhisperFusion builds upon the capabilities of WhisperLive and WhisperSpeech to provide seamless conversations with an AI.☆1,509Updated last month
- ☆181Updated 3 months ago
- Suno AI's Bark model in C/C++ for fast text-to-speech☆684Updated 2 months ago
- Local voice chatbot for engaging conversations, powered by Ollama, Hugging Face Transformers, and Coqui TTS Toolkit☆677Updated last month
- Implementation of Meta-Voicebox: the first generative AI model for speech to generalize across tasks with state-of-the-art performance.☆551Updated last year
- Replace OpenAI with Llama.cpp Automagically.☆276Updated 3 months ago
- The fastest Whisper optimization for automatic speech recognition as a command-line interface ⚡️☆308Updated 3 months ago
- LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve spee…☆891Updated last week
- ☆244Updated 6 months ago
- ☆161Updated last month
- An extremely fast implementation of whisper optimized for Apple Silicon using MLX.☆519Updated 4 months ago
- An Optimized Speech-to-Text Pipeline for the Whisper Model Supporting Multiple Inference Engines☆276Updated 3 weeks ago
- A multimodal, function calling powered LLM webui.☆204Updated 3 months ago
- Convenience scripts to finetune (chat-)LLaMa3 and other models for any language☆260Updated 3 months ago
- Minimal extension of OpenAI's Whisper adding speaker diarization with special tokens☆421Updated 10 months ago
- Low-latency AI companion voice talk in 60 lines of code using faster_whisper and elevenlabs input streaming☆234Updated 3 months ago
- A ggml (C++) re-implementation of tortoise-tts☆147Updated 3 weeks ago
- 🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production☆341Updated this week
- llama.cpp with the BakLLaVA model describes what it sees☆378Updated 10 months ago
- WavJourney: Compositional Audio Creation with LLMs☆513Updated 11 months ago
- Implementation of Voicebox, new SOTA Text-to-speech network from MetaAI, in Pytorch☆592Updated 7 months ago
- function calling-based LLM agents☆268Updated this week
- An application for running LLMs locally on your device, with your documents, facilitating detailed citations in generated responses.☆459Updated this week
- run paligemma in real time☆122Updated 4 months ago