mikeesto / gemini-transcribe
Transcribe audio and video files with speaker diarization and logically grouped timestamps using Gemini Flash
☆42 Updated last week
Alternatives and similar repositories for gemini-transcribe
Users who are interested in gemini-transcribe are comparing it to the libraries listed below.
- This public GitHub repository contains code for a fully self-hosted, on-premise transcription solution. ☆52 Updated last year
- ☆314 Updated 3 months ago
- Very fast, accurate speaker diarization ☆186 Updated this week
- Convert your PDFs and EPUBs into audiobooks effortlessly. Features intelligent text extraction, customizable text-to-speech settings, and… ☆141 Updated 8 months ago
- A highly optimized engine for neutts-air model to generate minutes of audio in seconds. Over 200x realtime on modern hardware! ☆57 Updated 2 weeks ago
- Voice agent using LiveKit (orchestration), Cartesia (TTS), OpenAI (LLM), and Deepgram (STT) ☆20 Updated last month
- Open source Python program for automating gain staging. part 1 of a series for automating audio processing tasks, end goal is to create a… ☆45 Updated 2 years ago
- Provide Gradio custom components to make the diarization-based audio labeling process easier and faster. ☆69 Updated last month
- 🎙️ Automatically transcribe audio/video into high-quality, speaker-specific Text-To-Speech datasets ✨ ☆129 Updated 4 months ago
- Open TTS models, built for streaming on the edge ☆44 Updated 8 months ago
- Efficient approach to speaker diarization using voice characteristics extraction ☆105 Updated 5 months ago
- StyleTTS-ZS: Efficient High-Quality Zero-Shot Text-to-Speech Synthesis with Distilled Time-Varying Style Diffusion ☆10 Updated last year
- Speaker diarization service ☆25 Updated 5 months ago
- a simple system for 2-way interruptible voice interactions between human and LLM ☆30 Updated last year
- ☆54 Updated 6 months ago
- speechlib is a library that can do speaker diarization, transcription and speaker recognition on an audio file to create transcripts with… ☆244 Updated 3 months ago
- ☆217 Updated last month
- Speech recognition & diarisation solution with text alignment, deployed in AML pipelines ☆99 Updated last year
- Local11Labs allows generating high-quality text-to-speech and podcast content using the fast and tiny Kokoro-82M. ☆49 Updated 10 months ago
- Tool for automatic transcription and speaker diarization based on whisper and pyannote. ☆62 Updated 10 months ago
- Roomey is a multi-purpose Voice Agent designed to run your personal and business life. ☆51 Updated 5 months ago
- Turn text from websites into spoken audio with edge-tts, F5, etc. and save as mp3 files ☆46 Updated 5 months ago
- A highly optimized engine for maya-1 tts model to generate minutes of audio in seconds. ☆49 Updated 3 weeks ago
- An open source chat bot architecture for voice/vision (and multimodal) assistants, local(CPU/GPU bound) and remote(I/O bound) to run. ☆88 Updated last week
- Open source implementation for computer use, using light OCR models and LLMs. Get Android app in link below. ☆29 Updated 3 weeks ago
- A TTS model capable of generating ultra-realistic dialogue in one pass. ☆31 Updated 7 months ago
- 💬 Transcribe, translate, diarize, annotate and subtitle video (and audio) with Whisper on Win, Linux and Mac ... fast! ☆71 Updated last week
- Experience the power of AI with this free AI voice generator demo. Utilizing Deepgram and Groq, we transform text into voice seamlessly. … ☆37 Updated last year
- Liquid Audio - Speech-to-Speech audio models by Liquid AI ☆289 Updated 2 months ago
- Add real-time Speech-to-Text to your LiveKit application with AssemblyAI ☆18 Updated 6 months ago