vb000 / LookOnceToHear
A novel human-interaction method for real-time speech extraction on headphones.
☆558 Updated 9 months ago
Alternatives and similar repositories for LookOnceToHear:
Users interested in LookOnceToHear are comparing it to the repositories listed below.
- Omni SenseVoice: High-Speed Speech Recognition with word timestamps 🗣️🎯 ☆826 Updated 3 weeks ago
- Real-time audio to chords, lyrics, beat, and melody. ☆687 Updated 7 months ago
- ☆609 Updated 2 weeks ago
- Official implementation of "WhisperNER: Unified Open Named Entity and Speech Recognition" ☆180 Updated last month
- Local voice chatbot for engaging conversations, powered by Ollama, Hugging Face Transformers, and Coqui TTS Toolkit ☆758 Updated 7 months ago
- Code and Pretrained Models for Interspeech 2023 Paper "Whisper-AT: Noise-Robust Automatic Speech Recognizers are Also Strong Audio Event … ☆365 Updated last year
- Multi-Scale Neural Audio Codec (SNAC) compresses audio into discrete codes at a low bitrate ☆531 Updated 4 months ago
- PyTorch-based speech enhancement toolkit. ☆336 Updated last year
- VoiceRestore: Flow-Matching Transformers for Universal Speech Restoration ☆157 Updated last week
- An implementation of bucketMul LLM inference ☆215 Updated 9 months ago
- ☆279 Updated 9 months ago
- ☆352 Updated 6 months ago
- Whisper with Medusa heads ☆824 Updated last month
- LlamaVoice is a llama-based large voice generation model, providing inference and training ability. ☆232 Updated 7 months ago
- Localized watermarking for AI-generated speech audios, with SOTA on robustness and very fast detector ☆535 Updated this week
- Local realtime voice AI ☆2,264 Updated 3 weeks ago
- Joint speech-language model - respond directly to audio! ☆369 Updated 9 months ago
- On-device voice activity detection (VAD) powered by deep learning ☆203 Updated this week
- first base model for full-duplex conversational audio ☆1,725 Updated 2 months ago
- A transformer-based network model for pitch detection ☆166 Updated last year
- Minimal extension of OpenAI's Whisper adding speaker diarization with special tokens ☆482 Updated last year
- A real-time silent speech recognition tool. ☆478 Updated last month
- PyTorch implementation of Audio Flamingo 2: An Audio-Language Model with Long-Audio Understanding and Expert Reasoning Abilities. ☆427 Updated last week
- PyTorch implementation of Universal Source Separation with Weakly Labelled Data. ☆345 Updated last year
- G2P ☆182 Updated this week
- Implementation of Voicebox, new SOTA Text-to-speech network from MetaAI, in PyTorch ☆646 Updated 6 months ago
- Unified automatic quality assessment for speech, music, and sound. ☆434 Updated this week
- ☆393 Updated last year
- Auto-AVSR: Lip-Reading Sentences Project ☆323 Updated 2 months ago
- The Open Source Code of UniAudio ☆551 Updated 8 months ago