vb000 / LookOnceToHear
A novel human-interaction method for real-time speech extraction on headphones.
☆554Updated 8 months ago
Alternatives and similar repositories for LookOnceToHear:
Users who are interested in LookOnceToHear are comparing it to the libraries listed below:
- Real-time audio to chords, lyrics, beat, and melody.☆685Updated 6 months ago
- Omni SenseVoice: High-Speed Speech Recognition with words timestamps 🗣️🎯☆799Updated last month
- A transformer-based network model for pitch detection☆164Updated last year
- Official implementation of "WhisperNER: Unified Open Named Entity and Speech Recognition"☆178Updated 2 months ago
- Auto-AVSR: Lip-Reading Sentences Project☆310Updated last month
- Joint speech-language model - respond directly to audio!☆365Updated 7 months ago
- Pytorch based speech enhancement toolkit.☆334Updated 11 months ago
- WhisperFusion builds upon the capabilities of WhisperLive and WhisperSpeech to provide seamless conversations with an AI.☆1,580Updated 6 months ago
- Multi-Scale Neural Audio Codec (SNAC) compresses audio into discrete codes at a low bitrate☆482Updated 3 months ago
- Local voice chatbot for engaging conversations, powered by Ollama, Hugging Face Transformers, and Coqui TTS Toolkit☆748Updated 6 months ago
- first base model for full-duplex conversational audio☆1,707Updated last month
- The Open Source Code of UniAudio☆543Updated 7 months ago
- Whisper with Medusa heads☆822Updated last week
- A vocal pitch correction web application (like Autotune)☆309Updated 2 years ago
- This is the PyTorch implementation of Universal Source Separation with Weakly Labelled Data.☆344Updated last year
- Minimal extension of OpenAI's Whisper adding speaker diarization with special tokens☆476Updated last year
- turnkey self-hosted offline transcription and diarization service with llm summary☆809Updated 4 months ago
- Implementation of Voicebox, new SOTA Text-to-speech network from MetaAI, in Pytorch☆639Updated 4 months ago
- The Sol Mate GPT but on your e-Paper display!☆311Updated 3 months ago
- Localized watermarking for AI-generated speech audios, with SOTA on robustness and very fast detector☆522Updated 3 months ago
- OpenCV+YOLO+LLAVA powered video surveillance system☆742Updated this week
- Code and Pretrained Models for Interspeech 2023 Paper "Whisper-AT: Noise-Robust Automatic Speech Recognizers are Also Strong Audio Event …☆355Updated last year
- This is a python implementation for stitching images.☆234Updated 4 months ago
- Real-time binaural target sound extraction model.☆80Updated 10 months ago
- State-of-the-art audio codec with 90x compression factor. Supports 44.1kHz, 24kHz, and 16kHz mono/stereo audio.☆1,297Updated 7 months ago
- Implementation of Meta-Voicebox : The first generative AI model for speech to generalize across tasks with state-of-the-art performance.☆575Updated last year
- Handwriting synthesis with Harfbuzz WASM.☆457Updated 6 months ago
- Suno AI's Bark model in C/C++ for fast text-to-speech generation☆781Updated 3 months ago