vb000 / LookOnceToHear
A novel human-interaction method for real-time speech extraction on headphones.
☆594Updated last year
Alternatives and similar repositories for LookOnceToHear
Users who are interested in LookOnceToHear are comparing it to the libraries listed below.
- Real-time audio to chords, lyrics, beat, and melody.☆712Updated last year
- Official implementation of "WhisperNER: Unified Open Named Entity and Speech Recognition"☆199Updated 10 months ago
- Pytorch based speech enhancement toolkit.☆337Updated last year
- Omni SenseVoice: High-Speed Speech Recognition with words timestamps 🗣️🎯☆880Updated 2 weeks ago
- Whisper with Medusa heads☆864Updated 4 months ago
- Code and Pretrained Models for Interspeech 2023 Paper "Whisper-AT: Noise-Robust Automatic Speech Recognizers are Also Strong Audio Event …☆411Updated last year
- Localized watermarking for AI-generated speech audios, with SOTA on robustness and very fast detector☆655Updated last week
- This is the PyTorch implementation of the Universal Source Separation with Weakly labelled Data.☆368Updated 2 years ago
- A transformer-based network model for pitch detection☆166Updated 4 months ago
- Minimal extension of OpenAI's Whisper adding speaker diarization with special tokens☆532Updated 2 years ago
- VoiceRestore: Flow-Matching Transformers for Universal Speech Restoration☆193Updated 8 months ago
- Joint speech-language model - respond directly to audio!☆372Updated last year
- ☆65Updated 10 months ago
- WhisperFusion builds upon the capabilities of WhisperLive and WhisperSpeech to provide seamless conversations with an AI.☆1,643Updated last year
- ☆174Updated last month
- ☆494Updated last year
- Implementation of Voicebox, new SOTA Text-to-speech network from MetaAI, in Pytorch☆668Updated last year
- ☆414Updated 2 years ago
- ☆382Updated last year
- ☆275Updated last year
- ☆261Updated last year
- ☆1,169Updated 3 weeks ago
- The Open Source Code of UniAudio☆593Updated last year
- first base model for full-duplex conversational audio☆1,771Updated 11 months ago
- Local voice chatbot for engaging conversations, powered by Ollama, Hugging Face Transformers, and Coqui TTS Toolkit☆785Updated last year
- This is the audio sample repository for speech separation model "MossFormer2".☆157Updated last year
- OpenCV+YOLO+LLAVA powered video surveillance system☆781Updated 2 months ago
- A real-time silent speech recognition tool.☆625Updated last month
- Go from raw audio files to a text-audio dataset automatically with OpenAI's Whisper.☆137Updated 2 years ago
- Implementation of Meta-Voicebox : The first generative AI model for speech to generalize across tasks with state-of-the-art performance.☆588Updated 2 years ago