vb000 / LookOnceToHear
A novel human-interaction method for real-time speech extraction on headphones.
☆572 Updated last year
Alternatives and similar repositories for LookOnceToHear
Users who are interested in LookOnceToHear are comparing it to the repositories listed below:
- Official implementation of "WhisperNER: Unified Open Named Entity and Speech Recognition" ☆194 Updated 4 months ago
- Whisper with Medusa heads ☆849 Updated this week
- Omni SenseVoice: High-Speed Speech Recognition with word timestamps 🗣️🎯 ☆854 Updated 4 months ago
- Real-time audio to chords, lyrics, beat, and melody. ☆696 Updated 10 months ago
- PyTorch-based speech enhancement toolkit. ☆338 Updated last year
- Joint speech-language model - respond directly to audio! ☆371 Updated last year
- ☆59 Updated 5 months ago
- Code and Pretrained Models for Interspeech 2023 Paper "Whisper-AT: Noise-Robust Automatic Speech Recognizers are Also Strong Audio Event … ☆395 Updated last year
- Multi-Scale Neural Audio Codec (SNAC) compresses audio into discrete codes at a low bitrate ☆623 Updated 7 months ago
- ☆489 Updated last year
- ☆782 Updated 2 months ago
- VoiceRestore: Flow-Matching Transformers for Universal Speech Restoration ☆174 Updated 2 months ago
- Auto-AVSR: Lip-Reading Sentences Project ☆359 Updated 6 months ago
- ☆369 Updated 10 months ago
- ☆259 Updated last year
- Implementation of Voicebox, new SOTA Text-to-speech network from MetaAI, in Pytorch ☆657 Updated 9 months ago
- This is the PyTorch implementation of the Universal Source Separation with Weakly labelled Data. ☆357 Updated last year
- Unified automatic quality assessment for speech, music, and sound. ☆531 Updated last month
- This is the audio sample repository for speech separation model "MossFormer2". ☆132 Updated 7 months ago
- Local voice chatbot for engaging conversations, powered by Ollama, Hugging Face Transformers, and Coqui TTS Toolkit ☆775 Updated 11 months ago
- A real-time silent speech recognition tool. ☆515 Updated 5 months ago
- Localized watermarking for AI-generated speech audios, with SOTA on robustness and very fast detector ☆577 Updated 2 months ago
- Implementation of E2-TTS, "Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS", in Pytorch ☆485 Updated 4 months ago
- A transformer-based network model for pitch detection ☆166 Updated last year
- WhisperFusion builds upon the capabilities of WhisperLive and WhisperSpeech to provide seamless conversations with an AI. ☆1,616 Updated 11 months ago
- First base model for full-duplex conversational audio ☆1,746 Updated 6 months ago
- Minimal extension of OpenAI's Whisper adding speaker diarization with special tokens ☆502 Updated last year
- Collection of Open Source Speech Data ☆159 Updated 8 months ago
- On-device voice activity detection (VAD) powered by deep learning ☆219 Updated this week
- Efficient approach to speaker diarization using voice characteristics extraction ☆97 Updated 3 weeks ago