vb000 / LookOnceToHear
A novel human-interaction method for real-time speech extraction on headphones.
☆596Updated last year
Alternatives and similar repositories for LookOnceToHear
Users who are interested in LookOnceToHear are comparing it to the repositories listed below.
- Omni SenseVoice: High-Speed Speech Recognition with word timestamps 🗣️🎯☆884Updated last month
- Official implementation of "WhisperNER: Unified Open Named Entity and Speech Recognition"☆200Updated 10 months ago
- Real-time audio to chords, lyrics, beat, and melody.☆713Updated last year
- PyTorch-based speech enhancement toolkit.☆337Updated last year
- Joint speech-language model - respond directly to audio!☆371Updated last year
- OpenCV+YOLO+LLAVA powered video surveillance system☆779Updated 2 months ago
- VoiceRestore: Flow-Matching Transformers for Universal Speech Restoration☆194Updated 8 months ago
- A transformer-based network model for pitch detection☆166Updated 5 months ago
- first base model for full-duplex conversational audio☆1,769Updated last year
- Whisper with Medusa heads☆863Updated 5 months ago
- ☆496Updated last year
- A simple "Be My Eyes" web app with a llama.cpp/llava backend☆493Updated 2 years ago
- ☆1,205Updated last week
- ☆65Updated 11 months ago
- Mistral7B playing DOOM☆138Updated last year
- Local voice chatbot for engaging conversations, powered by Ollama, Hugging Face Transformers, and Coqui TTS Toolkit☆785Updated last year
- Code and Pretrained Models for Interspeech 2023 Paper "Whisper-AT: Noise-Robust Automatic Speech Recognizers are Also Strong Audio Event …☆412Updated last year
- Minimal extension of OpenAI's Whisper adding speaker diarization with special tokens☆532Updated 2 years ago
- This is the PyTorch implementation of "Universal Source Separation with Weakly Labelled Data".☆367Updated 2 years ago
- A vocal pitch correction web application (like Autotune)☆322Updated 2 years ago
- Localized watermarking for AI-generated speech audio, with SOTA robustness and a very fast detector☆672Updated 3 weeks ago
- Auto-AVSR: Lip-Reading Sentences Project☆400Updated last year
- ☆175Updated 2 months ago
- On-device voice activity detection (VAD) powered by deep learning☆241Updated last week
- This is a Python implementation for stitching images.☆233Updated last year
- Performant and accurate speech recognition built on PyTorch☆254Updated 3 years ago
- Multi-Scale Neural Audio Codec (SNAC) compresses audio into discrete codes at a low bitrate☆740Updated last year
- Improving transcription performance of OpenAI Whisper for CPU based deployment☆257Updated 3 years ago
- ☆275Updated last year
- ☆259Updated last year