themanyone / caption_anything
Caption, translate, and optionally record in real time "what you hear" from speakers and microphone. Never miss part of the conversation again.
☆18 Updated 2 weeks ago
Alternatives and similar repositories for caption_anything
Users who are interested in caption_anything are comparing it to the libraries listed below.
- Self-hosted, high-quality voice recognition for de-googled Android using Whisper. Like Siri or OK Google. ☆64 Updated last year
- llmon-py is a multimodal webui for Llama 3-8B. ☆16 Updated 11 months ago
- OpenAI-Assistant API integration with Speech Recognition and Eleven Labs TTS. User can choose name, description, model of assistant and … ☆18 Updated last year
- Web interface for administering Ollama and model quantization, with public endpoints and an automated OpenAI proxy ☆50 Updated 3 months ago
- Streaming speech-to-text server using Whisper ☆93 Updated 2 years ago
- A voice assistant with WhisperAI speech recognition ☆31 Updated 7 months ago
- ☆25 Updated last week
- Open Server is an OpenAI API Compatible Server for generating text, images, embeddings, and storing them in vector databases. It also inc… ☆16 Updated last year
- Record audio and save a transcription to your system's clipboard with ctranslate2 and faster-whisper. ☆128 Updated last week
- Private voice keyboard, AI chat, images, webcam, recordings, voice control with >= 4 GiB of VRAM. ☆248 Updated last week
- IRIS: Demonstrator for use of LLMs in Python (outdated) ☆62 Updated 3 months ago
- Python app for LM Studio-enhanced voice conversations with local LLMs. Uses Whisper for speech-to-text and offers a privacy-focused, acce… ☆99 Updated last year
- Llama.cui is a small llama.cpp-based chat application for Node.js ☆18 Updated this week
- Transcription and Diarization based on OpenAI's Whisper ☆22 Updated last year
- Simple GUI to load a PDF/Docx/txt file and have LM Studio answer based on it. ☆14 Updated 10 months ago
- Simple terminal-based LLM interface. ☆11 Updated last year
- A simple speech-to-text and text-to-speech AI chatbot that can be run fully offline. ☆45 Updated last year
- Babylon.cpp is a C and C++ library for grapheme to phoneme conversion and text to speech synthesis. For phonemization an ONNX runtime port… ☆22 Updated 10 months ago
- Prompt Jinja2 templates for LLMs ☆31 Updated 3 weeks ago
- Deploy your GGML models to HuggingFace Spaces with Docker and Gradio ☆37 Updated 2 years ago
- Offline voice input panel & keyboard with punctuation for Android. ☆106 Updated last year
- A minimalistic automatic speech recognition Streamlit-based web app powered by OpenAI's Whisper "State of the Art" models ☆66 Updated 2 years ago
- Real-Time Whisper Voice Recognition with vosk model feedback. ☆115 Updated last year
- Demo Python script app to interact with a llama.cpp server using the Whisper API, microphone, and webcam devices. ☆46 Updated last year
- PyGPTPrompt: A CLI tool that manages context windows for AI models, facilitating user interaction and data ingestion for optimized long-t… ☆30 Updated last year
- VS Code extension for Ollama ☆25 Updated last year
- A free & open tool for transcribing audio interviews with offline ASR support ☆24 Updated last year
- BUD-E (Buddy) is an open-source voice assistant framework that facilitates seamless interaction with AI models and APIs, enabling the cre… ☆20 Updated 8 months ago
- This public GitHub repository contains code for a fully self-hosted, on-premise transcription solution. ☆52 Updated 6 months ago
- Speech-to-text, text-to-speech with ElevenLabs ☆28 Updated last year