lifeiteng / OmniSenseVoice
Omni SenseVoice: High-Speed Speech Recognition with words timestamps 🗣️🎯
☆851 Updated 3 months ago
Alternatives and similar repositories for OmniSenseVoice
Users who are interested in OmniSenseVoice are comparing it to the libraries listed below.
- Whisper with Medusa heads ☆842 Updated 3 weeks ago
- Open source inference code for Rev's model ☆404 Updated 2 months ago
- first base model for full-duplex conversational audio ☆1,749 Updated 5 months ago
- WhisperFusion builds upon the capabilities of WhisperLive and WhisperSpeech to provide a seamless conversations with an AI. ☆1,613 Updated 10 months ago
- Examples for Cerebrium Serverless GPUs ☆489 Updated last week
- Local SRT/LLM/TTS Voicechat ☆692 Updated 8 months ago
- Local voice chatbot for engaging conversations, powered by Ollama, Hugging Face Transformers, and Coqui TTS Toolkit ☆773 Updated 10 months ago
- ☆750 Updated 2 months ago
- Interface for OuteTTS models. ☆1,304 Updated 3 weeks ago
- Cross-platform speech toolset, used from the command-line or as a Node.js library. Includes a variety of engines for speech synthesis, sp… ☆370 Updated 3 weeks ago
- LlamaVoice is a llama-based large voice generation model, providing inference and training ability. ☆233 Updated 9 months ago
- Implementation of F5-TTS in MLX ☆554 Updated 3 months ago
- StreamSpeech is an “All in One” seamless model for offline and simultaneous speech recognition, speech translation and speech synthesis. ☆1,091 Updated this week
- Visualise your CSV files in seconds without sending your data anywhere ☆510 Updated last week
- Local realtime voice AI ☆2,328 Updated 3 months ago
- An extremely fast implementation of whisper optimized for Apple Silicon using MLX. ☆724 Updated last year
- Verbatim Automatic Speech Recognition with improved word-level timestamps and filler detection ☆741 Updated 2 weeks ago
- A Fast TTS Engine ☆514 Updated 4 months ago
- LLaSA: Scaling Train-time and Inference-time Compute for LLaMA-based Speech Synthesis ☆577 Updated 2 months ago
- Real-time audio to chords, lyrics, beat, and melody. ☆693 Updated 10 months ago
- Web scraper made for AI and simplicity in mind. It runs as a CLI that can be parallelized and outputs high-quality markdown content. ☆519 Updated 2 weeks ago
- ☆423 Updated last month
- ☆1,134 Updated 4 months ago
- Multi-modal OCR pipeline optimized for ML training (text, figure, math, tables, diagrams) ☆651 Updated last month
- Official implementation of "WhisperNER: Unified Open Named Entity and Speech Recognition" ☆190 Updated 3 months ago
- A lightweight end-to-end text-to-speech model ☆114 Updated 3 months ago
- TEN VAD: low-latency high-performance Voice Activity Detector ☆540 Updated 2 weeks ago
- A toolkit for speaker diarization. ☆203 Updated 2 weeks ago
- G2P ☆258 Updated last month
- Fast and accurate automatic speech recognition (ASR) for edge devices ☆2,757 Updated last month