Real Time Speech Transcription with FastRTC ⚡️ and Local Whisper 🤗
☆701 · Jul 10, 2025 · Updated 8 months ago
Alternatives and similar repositories for realtime-transcription-fastrtc
Users who are interested in realtime-transcription-fastrtc are comparing it to the libraries listed below.
- The python library for real-time communication ☆4,543 · Jan 12, 2026 · Updated last month
- Towards Human-Sounding Speech ☆5,983 · Dec 5, 2025 · Updated 3 months ago
- Fast Streaming TTS with Orpheus + WebRTC (with FastRTC) ☆350 · Apr 10, 2025 · Updated 11 months ago
- Local realtime voice AI ☆2,438 · Nov 26, 2025 · Updated 3 months ago
- Speech To Speech: an effort for an open-sourced and modular GPT4-o ☆4,486 · Updated this week
- Oliva Multi-Agent Assistant ☆387 · Apr 11, 2025 · Updated 10 months ago
- Interface for OuteTTS models. ☆1,427 · Jun 21, 2025 · Updated 8 months ago
- ☆8,826 · Oct 25, 2025 · Updated 4 months ago
- Whisper realtime streaming for long speech-to-text transcription and translation ☆3,546 · Nov 12, 2025 · Updated 3 months ago
- A Conversational Speech Generation Model ☆14,530 · May 27, 2025 · Updated 9 months ago
- Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audi… ☆9,799 · Updated this week
- Open Source framework for voice and multimodal conversational AI ☆10,529 · Mar 3, 2026 · Updated last week
- Official inference framework for 1-bit LLMs ☆28,697 · Feb 3, 2026 · Updated last month
- Hibiki is a model for streaming speech translation (also known as simultaneous translation). Unlike offline translation—where one waits f… ☆1,410 · Apr 15, 2025 · Updated 10 months ago
- Build datasets using natural language ☆568 · Sep 19, 2025 · Updated 5 months ago
- Real-Time Voice Inference Web SDK ☆304 · Mar 3, 2026 · Updated last week
- An experiment in meeting transcription and diarization with just an LLM. Maybe I went a little overboard though ☆568 · Nov 20, 2025 · Updated 3 months ago
- YT Navigator: AI-powered YouTube content explorer that lets you search and chat with channel videos using AI agents. Extract insights fro… ☆580 · Mar 27, 2025 · Updated 11 months ago
- A fast multimodal LLM for real-time voice ☆4,368 · Dec 12, 2025 · Updated 2 months ago
- WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization) ☆20,556 · Feb 22, 2026 · Updated 2 weeks ago
- AutoRAG: An Open-Source Framework for Retrieval-Augmented Generation (RAG) Evaluation & Optimization with AutoML-Style Automation ☆4,601 · Dec 23, 2025 · Updated 2 months ago
- A Python package that makes it easy for developers to create AI apps powered by various AI providers. ☆1,646 · Apr 8, 2025 · Updated 11 months ago
- Inference and training library for high-quality TTS models. ☆5,547 · Dec 10, 2024 · Updated last year
- Faster Whisper transcription with CTranslate2 ☆21,289 · Nov 19, 2025 · Updated 3 months ago
- Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller, within 1% word error rate. ☆4,049 · Jan 8, 2025 · Updated last year
- A minimal Python framework for building custom AI inference servers with full control over logic, batching, and scaling. ☆3,811 · Mar 2, 2026 · Updated last week
- ☆1,356 · Mar 3, 2026 · Updated last week
- ☆1,292 · Jan 29, 2026 · Updated last month
- Fast State-of-the-Art Static Embeddings ☆2,007 · Feb 28, 2026 · Updated last week
- Build, run, manage agentic software at scale. ☆38,516 · Updated this week
- Get your documents ready for gen AI ☆54,754 · Mar 3, 2026 · Updated last week
- Whisper with Medusa heads ☆865 · Aug 6, 2025 · Updated 7 months ago
- Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train OpenAI gpt-oss, DeepSeek, Qwen, Llama, Gemma, TTS 2x faster with 70% less VRAM. ☆53,029 · Mar 3, 2026 · Updated last week
- An Open Source Python alternative to NotebookLM's podcast feature: Transforming Multimodal Content into Captivating Multilingual Audio Co… ☆6,079 · Dec 9, 2025 · Updated 3 months ago
- ☆11 · Dec 23, 2023 · Updated 2 years ago
- A project that brings the power of Large Language Models (LLM) and Retrieval-Augmented Generation (RAG) within reach of everyone, particu… ☆38 · Jan 7, 2024 · Updated 2 years ago
- Useful resources for LLM-based Diarization and Transcription. ☆55 · Oct 15, 2024 · Updated last year
- 🤗 smolagents: a barebones library for agents that think in code. ☆25,756 · Mar 1, 2026 · Updated last week
- streamline the fine-tuning process for multimodal models: PaliGemma 2, Florence-2, and Qwen2.5-VL ☆2,660 · Mar 2, 2026 · Updated last week