lifeiteng / OmniSenseVoice
Omni SenseVoice: High-Speed Speech Recognition with word timestamps 🗣️🎯
☆848 · Updated 2 months ago
Alternatives and similar repositories for OmniSenseVoice
Users who are interested in OmniSenseVoice are comparing it to the libraries listed below.
- first base model for full-duplex conversational audio ☆1,746 · Updated 4 months ago
- Whisper with Medusa heads ☆838 · Updated last month
- Open source inference code for Rev's model ☆404 · Updated last month
- Local voice chatbot for engaging conversations, powered by Ollama, Hugging Face Transformers, and Coqui TTS Toolkit ☆767 · Updated 9 months ago
- ☆737 · Updated last month
- Examples for Cerebrium Serverless GPUs ☆487 · Updated 2 weeks ago
- Interface for OuteTTS models. ☆1,294 · Updated this week
- Implementation of F5-TTS in MLX ☆541 · Updated 2 months ago
- Local realtime voice AI ☆2,317 · Updated 3 months ago
- Local SRT/LLM/TTS Voicechat ☆680 · Updated 7 months ago
- WhisperFusion builds upon the capabilities of WhisperLive and WhisperSpeech to provide seamless conversations with an AI. ☆1,606 · Updated 10 months ago
- ☆404 · Updated 2 weeks ago
- Verbatim Automatic Speech Recognition with improved word-level timestamps and filler detection ☆723 · Updated 5 months ago
- A Fast TTS Engine ☆502 · Updated 4 months ago
- LlamaVoice is a llama-based large voice generation model, providing inference and training ability. ☆233 · Updated 9 months ago
- An API to transcribe audio with OpenAI's Whisper Large v3! ☆277 · Updated 6 months ago
- Multi-modal OCR pipeline optimized for ML training (text, figure, math, tables, diagrams) ☆645 · Updated 2 weeks ago
- An open-source OCR API that leverages OpenAI's powerful language models with optimized performance techniques like parallel processing an… ☆854 · Updated 8 months ago
- Official implementation of "WhisperNER: Unified Open Named Entity and Speech Recognition" ☆189 · Updated 3 months ago
- A real-time silent speech recognition tool. ☆502 · Updated 4 months ago
- Visualise your CSV files in seconds without sending your data anywhere ☆509 · Updated last week
- Cross-platform speech toolset, used from the command-line or as a Node.js library. Includes a variety of engines for speech synthesis, sp… ☆366 · Updated this week
- OpenCV+YOLO+LLAVA powered video surveillance system ☆760 · Updated 3 months ago
- G2P ☆248 · Updated last month
- turnkey self-hosted offline transcription and diarization service with llm summary ☆854 · Updated 8 months ago
- Have a natural, spoken conversation with AI! ☆2,375 · Updated 2 weeks ago
- LLaSA: Scaling Train-time and Inference-time Compute for LLaMA-based Speech Synthesis ☆567 · Updated last month
- Fast and accurate automatic speech recognition (ASR) for edge devices ☆2,718 · Updated 3 weeks ago
- Inference code for the paper "Spirit-LM Interleaved Spoken and Written Language Model". ☆908 · Updated 7 months ago
- Dolphin is a multilingual, multitask ASR model jointly trained by DataoceanAI and Tsinghua University. ☆499 · Updated 2 weeks ago