themanyone / caption_anything
Caption, translate, and optionally record in real time "what you hear" from speakers and microphone. Never miss part of the conversation again.
☆20Updated last month
Alternatives and similar repositories for caption_anything
Users who are interested in caption_anything are comparing it to the libraries listed below
- Private voice keyboard, AI chat, images, webcam, recordings, voice control with >= 4 GiB of VRAM.☆269Updated last month
- This is a Raspberry Pi 5 whisper C++ voice assistant - backwards compatible with Pi4☆24Updated last year
- Self hosted high quality voice recognition for de-googled Android using whisper. Like Siri or OK Google.☆67Updated last year
- Streaming speech-to-text server using Whisper☆95Updated 2 years ago
- A curated list of awesome OpenAI Whisper resources☆98Updated 2 years ago
- A lightweight Python library for running TTS models with a unified API.☆21Updated 8 months ago
- ☆31Updated last month
- Speech recognition & diarisation solution with text alignment, deployed in AML pipelines☆97Updated last year
- IRIS: Demonstrator for use of LLMs in Python (outdated)☆63Updated 7 months ago
- Real-time processing and delivery of sentences from a continuous stream of characters or text chunks.☆70Updated 3 months ago
- Speaker diarization service☆24Updated 4 months ago
- Python app for LM Studio-enhanced voice conversations with local LLMs. Uses Whisper for speech-to-text and offers a privacy-focused, acce…☆119Updated last year
- Transcribe audio and video files with speaker diarization and logically grouped timestamps using Gemini Flash☆40Updated last month
- Whisper from OpenAI and diarization with Pyannote☆49Updated last year
- This public GitHub repository contains code for a fully self-hosted, on-premise transcription solution.☆54Updated 10 months ago
- Real-Time Whisper Voice Recognition with vosk model feedback.☆119Updated 2 years ago
- A free & open tool for transcribing audio interviews with offline ASR support☆25Updated last year
- On-device noise suppression powered by deep learning☆76Updated 2 months ago
- llmon-py is a multimodal webui for Llama 3-8B.☆16Updated last year
- Offline voice input panel & keyboard with punctuation for Android.☆108Updated last year
- Deploy your GGML models to Hugging Face Spaces with Docker and Gradio☆37Updated 2 years ago
- Record audio and save a transcription to your system's clipboard with ctranslate2 and faster-whisper.☆155Updated last month
- Open Server is an OpenAI API Compatible Server for generating text, images, embeddings, and storing them in vector databases. It also inc…☆17Updated last year
- Web-based editor for subtitles and transcripts☆141Updated last year
- faster-whisper as serverless endpoint☆121Updated 5 months ago
- An OpenAI API-compatible speech-to-text server for audio transcription and translation, a.k.a. Whisper.☆88Updated 9 months ago
- TypeScript-based library for real-time audio transcription, integrating OpenAI's Whisper model for accurate speech-to-text conversion.☆71Updated last year
- 💬 ASR FastAPI server using faster-whisper and Multi-Scale Auto-Tuning Spectral Clustering for diarization.☆217Updated last year
- ☆74Updated last year
- An Extension for oobabooga/text-generation-webui☆36Updated 2 years ago