vb000 / LookOnceToHear
A novel human-interaction method for real-time speech extraction on headphones.
☆566Updated last year
Alternatives and similar repositories for LookOnceToHear
Users that are interested in LookOnceToHear are comparing it to the libraries listed below
- Real-time audio to chords, lyrics, beat, and melody.☆693Updated 10 months ago
- Omni SenseVoice: High-Speed Speech Recognition with words timestamps 🗣️🎯☆851Updated 3 months ago
- Official implementation of "WhisperNER: Unified Open Named Entity and Speech Recognition"☆190Updated 3 months ago
- A transformer-based network model for pitch detection☆166Updated last year
- LlamaVoice is a llama-based large voice generation model, providing inference and training ability.☆233Updated 9 months ago
- Pytorch based speech enhancement toolkit.☆338Updated last year
- first base model for full-duplex conversational audio☆1,749Updated 5 months ago
- ☆270Updated last year
- OpenCV+YOLO+LLAVA powered video surveillance system☆761Updated last week
- Joint speech-language model - respond directly to audio!☆369Updated 11 months ago
- Mistral7B playing DOOM☆132Updated 11 months ago
- WhisperFusion builds upon the capabilities of WhisperLive and WhisperSpeech to provide seamless conversations with an AI.☆1,613Updated 10 months ago
- Implementation of Meta-Voicebox : The first generative AI model for speech to generalize across tasks with state-of-the-art performance.☆582Updated 2 years ago
- Demo of twilio☆274Updated last year
- Code and Pretrained Models for Interspeech 2023 Paper "Whisper-AT: Noise-Robust Automatic Speech Recognizers are Also Strong Audio Event …☆386Updated last year
- Whisper with Medusa heads☆842Updated 3 weeks ago
- A ggml (C++) re-implementation of tortoise-tts☆186Updated 10 months ago
- ☆750Updated 2 months ago
- A simple "Be My Eyes" web app with a llama.cpp/llava backend☆489Updated last year
- Local voice chatbot for engaging conversations, powered by Ollama, Hugging Face Transformers, and Coqui TTS Toolkit☆773Updated 10 months ago
- Multi-Scale Neural Audio Codec (SNAC) compresses audio into discrete codes at a low bitrate☆617Updated 7 months ago
- example free website for client-side music demixing with Demucs + WebAssembly☆347Updated last month
- ☆365Updated 9 months ago
- Auto-AVSR: Lip-Reading Sentences Project☆347Updated 5 months ago
- ☆157Updated last year
- Algebraic enhancements for GEMM & AI accelerators☆277Updated 3 months ago
- Implementation of Voicebox, new SOTA Text-to-speech network from MetaAI, in Pytorch☆655Updated 8 months ago
- Minimal extension of OpenAI's Whisper adding speaker diarization with special tokens☆497Updated last year
- Suno AI's Bark model in C/C++ for fast text-to-speech generation☆826Updated 7 months ago
- A real-time silent speech recognition tool.☆507Updated 4 months ago