vb000 / LookOnceToHear
A novel human-interaction method for real-time speech extraction on headphones.
☆580 Updated last year
Alternatives and similar repositories for LookOnceToHear
Users who are interested in LookOnceToHear are comparing it to the libraries listed below.
- Omni SenseVoice: High-Speed Speech Recognition with word timestamps 🗣️🎯 ☆862 Updated 5 months ago
- Official implementation of "WhisperNER: Unified Open Named Entity and Speech Recognition" ☆196 Updated 5 months ago
- Pytorch based speech enhancement toolkit. ☆337 Updated last year
- Real-time audio to chords, lyrics, beat, and melody. ☆703 Updated last year
- Whisper with Medusa heads ☆852 Updated 2 weeks ago
- Joint speech-language model - respond directly to audio! ☆370 Updated last year
- VoiceRestore: Flow-Matching Transformers for Universal Speech Restoration ☆182 Updated 4 months ago
- A transformer-based network model for pitch detection ☆167 Updated 3 weeks ago
- WhisperFusion builds upon the capabilities of WhisperLive and WhisperSpeech to provide seamless conversations with an AI. ☆1,626 Updated last year
- Code and Pretrained Models for Interspeech 2023 Paper "Whisper-AT: Noise-Robust Automatic Speech Recognizers are Also Strong Audio Event … ☆402 Updated last year
- first base model for full-duplex conversational audio ☆1,750 Updated 7 months ago
- Local voice chatbot for engaging conversations, powered by Ollama, Hugging Face Transformers, and Coqui TTS Toolkit ☆783 Updated last year
- Auto-AVSR: Lip-Reading Sentences Project ☆369 Updated 7 months ago
- Localized watermarking for AI-generated speech audios, with SOTA on robustness and very fast detector ☆589 Updated last week
- Minimal extension of OpenAI's Whisper adding speaker diarization with special tokens ☆508 Updated last year
- This is the PyTorch implementation of the Universal Source Separation with Weakly labelled Data. ☆363 Updated last year
- ☆261 Updated last year
- On-device voice activity detection (VAD) powered by deep learning ☆227 Updated last week
- ☆377 Updated 11 months ago
- On-device speech-to-text engine powered by deep learning ☆459 Updated last week
- A real-time silent speech recognition tool. ☆533 Updated 6 months ago
- Mistral7B playing DOOM ☆135 Updated last year
- ☆307 Updated last year
- Collection of Open Source Speech Data ☆159 Updated 9 months ago
- Open source inference code for Rev's model ☆422 Updated 4 months ago
- Implementation of Meta-Voicebox: The first generative AI model for speech to generalize across tasks with state-of-the-art performance. ☆583 Updated 2 years ago
- Go from raw audio files to a text-audio dataset automatically with OpenAI's Whisper. ☆137 Updated 2 years ago
- Suno AI's Bark model in C/C++ for fast text-to-speech generation ☆834 Updated 9 months ago
- ☆491 Updated last year
- Efficient approach to speaker diarization using voice characteristics extraction ☆98 Updated 2 months ago