badgids / transcription-app
A transcription application that listens to microphone audio, transcribes it into text using OpenAI's Whisper, and simulates typing the transcription in real time wherever your cursor is on the screen. It can also do real-time translation.
☆15 Updated 9 months ago
Related projects
Alternatives and complementary repositories for transcription-app
- VoiceCraftAI is a revolutionary AI tool to dub videos into multiple regional languages and lip-sync at the same time. ☆46 Updated last month
- ☆51 Updated 2 months ago
- A TTS extension for oobabooga text WebUI ☆26 Updated 6 months ago
- Efficient approach to speaker diarization using voice characteristics extraction ☆68 Updated 6 months ago
- A simple extension that uses Bark Text-to-Speech for audio output ☆35 Updated last year
- ☆77 Updated 4 months ago
- Accepts a Hugging Face model URL, automatically downloads and quantizes it using Bits and Bytes. ☆38 Updated 8 months ago
- Text-to-Music Generation with Rectified Flow Transformer ☆48 Updated 2 months ago
- (Windows/Linux/macOS) Local WebUI with neural network models (Text, Image, Video, 3D, Audio) in Python (Gradio interface). Translated on … ☆73 Updated last week
- High-performance ASR tool using Faster Whisper, supporting custom models, multi-language transcription, and real-time processing feedback… ☆10 Updated 3 weeks ago
- llmon-py is a multimodal webui for Llama 3-8B. ☆15 Updated 4 months ago
- Open dubbing is an AI dubbing system which uses machine learning models to automatically translate and synchronize audio dialogue into di… ☆63 Updated this week
- Diffusion_TTS extension for booga ☆63 Updated 4 months ago
- Simulates talk with an AI that can express emotions ☆31 Updated 3 months ago
- Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching" ☆71 Updated last month
- AI 3D avatar voice interface in browser. VAD -> STT -> LLM -> TTS -> VRM (Prototype/Proof-of-Concept) ☆64 Updated last year
- ☆26 Updated 11 months ago
- ☆68 Updated 8 months ago
- ☆40 Updated 7 months ago
- A plugin for Oobabooga TextUI that allows you to search multiple search engines. Initially we're using Google API or DuckDuckGo. ☆16 Updated last year
- Provide Gradio custom components to make the diarization-based audio labeling process easier and faster. ☆45 Updated 2 weeks ago
- ☆32 Updated 2 weeks ago
- Modern Desktop Application offering a suite of tools for audio/video text recognition and a variety of other useful utilities. ☆43 Updated 3 months ago
- A library for real-time Speech to Text (STT), and Text to Speech (TTS) capability ☆30 Updated 11 months ago
- Turn text from websites into spoken audio with edge-tts, F5, etc. and save as mp3 files ☆29 Updated this week
- A web search extension for Oobabooga's text-generation-webui (now with nougat) ☆64 Updated 4 months ago
- Listen, transcribe, reply - Voice Assistant using OpenAI & ElevenLabs APIs ☆14 Updated last year
- ☆15 Updated last year
- A very simple implementation of edge_tts w/ RVC for oobabooga text-generation-webui. ☆41 Updated 9 months ago
- A Lightweight Gradio Web interface for Text-to-Audio Generation utilising SAO1.0 ☆46 Updated 5 months ago