themanyone / caption_anything
Caption, translate, and optionally record in real time "what you hear" from speakers and microphone. Never miss part of the conversation again.
☆19Updated last month
Alternatives and similar repositories for caption_anything
Users who are interested in caption_anything are comparing it to the repositories listed below.
- llmon-py is a multimodal webui for Llama 3-8B.☆16Updated last year
- Python app for LM Studio-enhanced voice conversations with local LLMs. Uses Whisper for speech-to-text and offers a privacy-focused, acce…☆102Updated last year
- Private voice keyboard, AI chat, images, webcam, recordings, voice control with >= 4 GiB of VRAM.☆256Updated last month
- A curated list of awesome OpenAI's Whisper☆101Updated last year
- ☆27Updated 2 weeks ago
- This public GitHub repository contains code for a fully self-hosted, on-premise transcription solution.☆53Updated 7 months ago
- On-device streaming text-to-speech engine powered by deep learning☆98Updated this week
- A free & open tool for transcribing audio interviews with offline ASR support☆24Updated last year
- This is a Raspberry Pi 5 whisper C++ voice assistant - backwards compatible with Pi4☆24Updated last year
- 💬 Transcribe, translate, diarize, annotate and subtitle video (and audio) with Whisper on Win, Linux and Mac ... fast!☆56Updated last month
- Self hosted high quality voice recognition for de-googled Android using whisper. Like Siri or OK Google.☆64Updated last year
- LlamaCards is a web application that provides a dynamic interface for interacting with LLM models in real-time. This app allows users to …☆39Updated 10 months ago
- Local11Labs allows generating high-quality text-to-speech and podcast content using the fast and tiny Kokoro-82M.☆47Updated 6 months ago
- Record audio and save a transcription to your system's clipboard with ctranslate2 and faster-whisper.☆133Updated this week
- A lightweight Python library for running TTS models with a unified API.☆20Updated 5 months ago
- An open-source, browser-based transcript viewer and manager. Upload, transcribe, and chat with meeting recordings using AI. Features meet…☆55Updated 2 months ago
- AI-Powered Podcast Generator: A Python-based tool that converts text scripts into realistic audio podcasts using Google's Generative AI A…☆40Updated 7 months ago
- Effortlessly record, transcribe, and summarize meetings with this user-friendly desktop utility powered by OpenAI's Whisper and GPT-3.5-t…☆186Updated 2 years ago
- Simple GUI to load a PDF/Docx/txt file and have LM Studio Answer based off of it.☆14Updated 11 months ago
- Webinterface for administrating Ollama and model Quantization with public endpoints and automized OPENAI proxy☆50Updated 4 months ago
- Transcribe audio and video files with speaker diarization and logically grouped timestamps using Gemini Flash☆33Updated 3 weeks ago
- Modern Desktop Application offering a suite of tools for audio/video text recognition and a variety of other useful utilities.☆56Updated 11 months ago
- IRIS: Demonstrator for use of LLMs in python (outdated)☆62Updated 4 months ago
- A bash script using OpenAI Whisper API for continuous audio transcription with automatic silence detection☆111Updated last year
- 💬📝 A small dictation app using OpenAI's Whisper speech recognition model.☆10Updated 10 months ago
- OpenAI-Assistant API integration with Speech Recognition and Eleven Labs TTS. User can choose name, description, model of assistant and …☆18Updated last year
- Speaker diarization service☆23Updated last month
- How might we mix OpenAI and Langchain and ElevenLabs to speak out responses to prompts using a body of knowledge encapsulated in PDFs?☆38Updated 2 years ago
- Coqui AI TTS plugin☆85Updated 3 weeks ago
- ☆55Updated 2 weeks ago