lifeiteng / OmniSenseVoice
Omni SenseVoice: High-Speed Speech Recognition with word timestamps 🗣️🎯
★ 699, updated this week
Related projects
Alternatives and complementary repositories for OmniSenseVoice
- First base model for full-duplex conversational audio (★ 1,362, updated this week)
- Local voice chatbot for engaging conversations, powered by Ollama, Hugging Face Transformers, and Coqui TTS Toolkit (★ 714, updated 3 months ago)
- Whisper with Medusa heads (★ 800, updated last week)
- Open source inference code for Rev's model (★ 331, updated 2 weeks ago)
- Interface for OuteTTS models (★ 317, updated this week)
- (★ 443, updated this week)
- Local SRT/LLM/TTS Voicechat (★ 535, updated last month)
- An open-source OCR API that leverages OpenAI's powerful language models with optimized performance techniques like parallel processing an… (★ 688, updated last month)
- A fast multimodal LLM for real-time voice (★ 980, updated this week)
- Implementation of F5-TTS in MLX (★ 311, updated last week)
- WhisperFusion builds upon the capabilities of WhisperLive and WhisperSpeech to provide seamless conversations with an AI (★ 1,544, updated 3 months ago)
- An API to transcribe audio with OpenAI's Whisper Large v3! (★ 185, updated 2 months ago)
- LlamaVoice is a llama-based large voice generation model, providing inference and training ability (★ 219, updated 2 months ago)
- Llama3.1 learns to Listen (★ 1,749, updated last week)
- An extremely fast implementation of Whisper optimized for Apple Silicon using MLX (★ 578, updated 6 months ago)
- Real-time audio to chords, lyrics, beat, and melody (★ 667, updated 2 months ago)
- 🦛 CHONK your texts with Chonkie ✨ - The no-nonsense RAG chunking library (★ 789, updated this week)
- Fast and accurate automatic speech recognition (ASR) for edge devices (★ 2,107, updated last week)
- Verbatim Automatic Speech Recognition with improved word-level timestamps and filler detection (★ 252, updated 2 months ago)
- ⚡ Insanely fast AI voice assistant with <500ms response times (★ 302, updated 2 months ago)
- Turnkey self-hosted offline transcription and diarization service with LLM summary (★ 733, updated last month)
- Export your personal data in one click (★ 935, updated this week)
- OpenCV+YOLO+LLAVA powered video surveillance system (★ 687, updated 3 weeks ago)
- With one command, create a natural-sounding audiobook from a variety of input formats (epub, mobi, txt, PDF, HTML and more!) (★ 571, updated 3 weeks ago)
- An open source voice-enabled, compact, empathic AI hardware + software 🤖 framework for companionship, entertainment, education, pediatri… (★ 399, updated this week)
- Convert any PDF into a podcast episode! (★ 572, updated 3 weeks ago)
- Web scraper made with AI and simplicity in mind. It runs as a CLI that can be parallelized and outputs high-quality markdown content. (★ 486, updated this week)
- (★ 717, updated this week)
- ScribeWizard: Generate organized notes from audio using Groq, Whisper, and Llama3 (★ 453, updated 2 months ago)
- Inference code for the paper "Spirit-LM Interleaved Spoken and Written Language Model" (★ 755, updated 2 weeks ago)