lifeiteng / OmniSenseVoice
Omni SenseVoice: High-Speed Speech Recognition with word timestamps 🗣️🎯
★799 Updated last month
Alternatives and similar repositories for OmniSenseVoice:
Users interested in OmniSenseVoice are comparing it to the libraries listed below.
- Examples for Cerebrium Serverless GPUs ★461 Updated this week
- Whisper with Medusa heads ★822 Updated last week
- first base model for full-duplex conversational audio ★1,707 Updated last month
- Open source inference code for Rev's model ★377 Updated last month
- Interface for OuteTTS models. ★926 Updated last week
- Local voice chatbot for engaging conversations, powered by Ollama, Hugging Face Transformers, and Coqui TTS Toolkit ★748 Updated 6 months ago
- An open-source OCR API that leverages OpenAI's powerful language models with optimized performance techniques like parallel processing an… ★828 Updated 4 months ago
- Local SRT/LLM/TTS Voicechat ★620 Updated 4 months ago
- Visualise your CSV files in seconds without sending your data anywhere ★493 Updated last month
- Local realtime voice AI ★2,230 Updated this week
- WhisperFusion builds upon the capabilities of WhisperLive and WhisperSpeech to provide seamless conversations with an AI. ★1,580 Updated 6 months ago
- A Fast TTS Engine ★451 Updated 3 weeks ago
- LlamaVoice is a llama-based large voice generation model, providing inference and training ability. ★232 Updated 5 months ago
- Implementation of F5-TTS in MLX ★477 Updated 2 weeks ago
- StreamSpeech is an “All in One” seamless model for offline and simultaneous speech recognition, speech translation and speech synthesis. ★1,027 Updated 5 months ago
- Verbatim Automatic Speech Recognition with improved word-level timestamps and filler detection ★584 Updated 2 months ago
- LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve spee… ★2,809 Updated 3 months ago
- An extremely fast implementation of whisper optimized for Apple Silicon using MLX. ★646 Updated 9 months ago
- ★434 Updated 5 months ago
- A toolkit for speaker diarization. ★172 Updated 3 months ago
- Inference code for the paper "Spirit-LM Interleaved Spoken and Written Language Model".