Standard-Intelligence / hertz-dev
first base model for full-duplex conversational audio
☆1,362Updated this week
Related projects
Alternatives and complementary repositories for hertz-dev
- Llama3.1 learns to Listen☆1,749Updated last week
- Whisper with Medusa heads☆800Updated last week
- A fast multimodal LLM for real-time voice☆980Updated this week
- Omni SenseVoice: High-Speed Speech Recognition with word timestamps 🗣️🎯☆699Updated this week
- LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve spee…☆2,538Updated last month
- WhisperFusion builds upon the capabilities of WhisperLive and WhisperSpeech to provide seamless conversations with an AI.☆1,544Updated 3 months ago
- Local voice chatbot for engaging conversations, powered by Ollama, Hugging Face Transformers, and Coqui TTS Toolkit☆714Updated 3 months ago
- Inference code for the paper "Spirit-LM Interleaved Spoken and Written Language Model".☆755Updated 2 weeks ago
- Local SRT/LLM/TTS Voicechat☆535Updated last month
- Interface for OuteTTS models.☆317Updated this week
- ☆1,092Updated 4 months ago
- Fast and accurate automatic speech recognition (ASR) for edge devices☆2,107Updated last week
- ☆443Updated this week
- An Open Source text-to-speech system built by inverting Whisper.☆3,956Updated 4 months ago
- Open Source framework for voice and multimodal conversational AI☆3,346Updated this week
- A modular voice assistant application for experimenting with state-of-the-art transcription, response generation, and text-to-speech mode…☆783Updated last week
- Implementation of F5-TTS in MLX☆311Updated last week
- ☆717Updated this week
- Controllable and fast Text-to-Speech for over 7000 languages!☆1,448Updated this week
- Example UI implementing the RTVI web client☆471Updated last month
- Joint speech-language model - respond directly to audio!☆355Updated 4 months ago
- Open source inference code for Rev's model☆331Updated 2 weeks ago
- Inference and training library for high-quality TTS models.☆4,592Updated last week
- Convert any PDF into a podcast episode!☆1,438Updated last week
- Converts text to speech in realtime☆1,989Updated this week
- An extremely fast implementation of whisper optimized for Apple Silicon using MLX.☆578Updated 6 months ago
- ⚡ Insanely fast AI voice assistant with <500ms response times☆302Updated 2 months ago
- Speech To Speech: an effort for an open-sourced and modular GPT4-o☆3,499Updated last week
- open-source multimodal large language model that can hear, talk while thinking. Featuring real-time end-to-end speech input and streaming…☆3,067Updated last week
- LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs☆1,462Updated 2 weeks ago