themanyone / whisper_dictation
Fast! Offline, privacy-focused, hands-free voice typing, 2-way AI voice chat, AI images, webcam, recorder, voice control, in under 4 GiB of VRAM.
★157 · Updated this week
Related projects:
- State-of-the-art voice typing in a Linux terminal (or a WSL session on Windows) with a simple bash script. Works with X. Does not require X. ★58 · Updated 7 months ago
- A small dictation app using OpenAI's Whisper speech recognition model. ★304 · Updated 3 weeks ago
- The fastest Whisper optimization for automatic speech recognition as a command-line interface ★308 · Updated 3 months ago
- IRIS: Intelligent Residential Integration System - a mind for your home! ★61 · Updated 9 months ago
- Like ChatGPT's voice conversations with an AI, but entirely offline/private/trade-secret-friendly, using local AI models such as LLama 2 … ★128 · Updated 3 weeks ago
- Use a local llama LLM or OpenAI to chat, discuss or summarize your documents, YouTube videos, and so on. ★149 · Updated 4 months ago
- speechlib is a library that can do speaker diarization, transcription and speaker recognition on an audio file to create transcripts with… ★134 · Updated 3 weeks ago
- API server for Instant voice cloning by MyShell. ★59 · Updated 4 months ago
- Shush is an app that deploys a WhisperV3 model with Flash Attention v2 on Modal and makes requests to it via a NextJS app ★171 · Updated 3 months ago
- Low-latency AI companion voice talk in 60 lines of code using faster_whisper and ElevenLabs input streaming ★234 · Updated 3 months ago
- Command Your World with Voice ★368 · Updated 3 weeks ago
- Made slight modifications to the Tortoise API, provided 3 additional scripts to make using Tortoise easier. Less focus on cloning makes s… ★47 · Updated 4 months ago
- Real-Time Whisper Voice Recognition with vosk model feedback. ★103 · Updated last year
- An Optimized Speech-to-Text Pipeline for the Whisper Model Supporting Multiple Inference Engine ★276 · Updated 3 weeks ago
- XTTSv2 Extension for oobabooga text-generation-webui ★143 · Updated 9 months ago
- AI stack for interacting with LLMs, Stable Diffusion, Whisper, xTTS and many other AI models ★131 · Updated 4 months ago
- Record audio and save a transcription to your system's clipboard with ctranslate2 and faster-whisper. ★53 · Updated last month
- Web-based editor for subtitles and transcripts ★102 · Updated last month
- ASR FastAPI server using faster-whisper and Multi-Scale Auto-Tuning Spectral Clustering for diarization. ★188 · Updated last month
- Offline voice input panel & keyboard with punctuation for Android. ★84 · Updated 3 months ago
- An autonomous AI agent extension for Oobabooga's web ui ★175 · Updated last year
- ★62 · Updated 4 months ago
- Site for sharing Bark voices ★47 · Updated 2 months ago
- A curated list of awesome OpenAI's Whisper ★91 · Updated last year
- ★206 · Updated this week
- A local AI companion that uses a collection of free, open source AI models in order to create two virtual companions that will follow you… ★67 · Updated 3 weeks ago
- Efficient approach to speaker diarization using voice characteristics extraction ★56 · Updated 4 months ago
- Modern Desktop Application offering a suite of tools for audio/video text recognition and a variety of other useful utilities. ★43 · Updated last month
- Input a YouTube video link or upload a video file and get a video with subtitles. ★93 · Updated 3 weeks ago
- ★38 · Updated this week