vb000 / LookOnceToHear
A novel human-interaction method for real-time speech extraction on headphones.
☆563 Updated 10 months ago
Alternatives and similar repositories for LookOnceToHear:
Users who are interested in LookOnceToHear are comparing it to the repositories listed below.
- Minimal extension of OpenAI's Whisper adding speaker diarization with special tokens ☆491 Updated last year
- Official implementation of "WhisperNER: Unified Open Named Entity and Speech Recognition" ☆187 Updated last month
- Omni SenseVoice: High-Speed Speech Recognition with word timestamps 🗣️🎯 ☆832 Updated last month
- Joint speech-language model - respond directly to audio! ☆368 Updated 9 months ago
- first base model for full-duplex conversational audio ☆1,731 Updated 3 months ago
- PyTorch-based speech enhancement toolkit. ☆336 Updated last year
- ☆658 Updated this week
- LlamaVoice is a llama-based large voice generation model, providing inference and training ability. ☆231 Updated 7 months ago
- A transformer-based network model for pitch detection ☆164 Updated last year
- Real-time audio to chords, lyrics, beat, and melody. ☆689 Updated 8 months ago
- Whisper with Medusa heads ☆830 Updated last month
- Implementation of Voicebox, new SOTA Text-to-speech network from MetaAI, in PyTorch ☆649 Updated 6 months ago
- This is the PyTorch implementation of Universal Source Separation with Weakly Labelled Data. ☆347 Updated last year
- Local voice chatbot for engaging conversations, powered by Ollama, Hugging Face Transformers, and Coqui TTS Toolkit ☆762 Updated 8 months ago
- Real-time binaural target sound extraction model. ☆83 Updated last year
- Code and Pretrained Models for Interspeech 2023 Paper "Whisper-AT: Noise-Robust Automatic Speech Recognizers are Also Strong Audio Event … ☆371 Updated last year
- ☆394 Updated last year
- ☆53 Updated 2 months ago
- VoiceRestore: Flow-Matching Transformers for Universal Speech Restoration ☆160 Updated 3 weeks ago
- WhisperFusion builds upon the capabilities of WhisperLive and WhisperSpeech to provide seamless conversations with an AI. ☆1,597 Updated 8 months ago
- A vocal pitch correction web application (like Autotune) ☆312 Updated 2 years ago
- Multi-Scale Neural Audio Codec (SNAC) compresses audio into discrete codes at a low bitrate ☆555 Updated 5 months ago
- Implementation of E2-TTS, "Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS", in PyTorch ☆462 Updated last month
- Auto-AVSR: Lip-Reading Sentences Project ☆329 Updated 3 months ago
- A real-time silent speech recognition tool. ☆485 Updated 2 months ago
- An implementation of bucketMul LLM inference ☆216 Updated 9 months ago
- ☆255 Updated last year
- ☆282 Updated 10 months ago
- Implementation of Spear-TTS - multi-speaker text-to-speech attention network, in PyTorch ☆268 Updated last year
- Efficient approach to speaker diarization using voice characteristics extraction ☆92 Updated 11 months ago