nyrahealth / CrisperWhisper
Verbatim Automatic Speech Recognition with improved word-level timestamps and filler detection
☆163 Updated last week
Related projects:
- A lightweight end-to-end text-to-speech model ☆79 Updated this week
- Zero-shot voice conversion with in-context learning ☆135 Updated this week
- We Speech Transcript based on LLM, in 300 lines of code. ☆117 Updated last month
- Speech Diarization for scrum automation ☆94 Updated last year
- Nendo is an open source platform for AI-driven audio management, intelligence, and generation. ☆116 Updated 5 months ago
- Have a natural voice conversation with an LLM ☆189 Updated this week
- LlamaVoice is a llama-based large voice generation model, providing inference and training ability. ☆169 Updated 3 weeks ago
- A multilingual (97 languages) tool that automatically recognizes and segments mixed-language text content for TTS. ☆90 Updated last week
- Fuse ChatTTS with OpenVoice, upload a 10-second audio clip, and clone your personalized ChatTTS voice. ☆298 Updated 2 months ago
- Live-Transcription (STT) with Whisper PoC ☆140 Updated 3 months ago
- ☆166 Updated 9 months ago
- Llama3.1 learns to Listen ☆134 Updated this week
- 🎧 Pod-Helper: Real-time audio transcription and repair on consumer hardware ☆78 Updated 6 months ago
- Pandrator aspires to be a user-friendly app with a graphical interface and a one-click installer that creates high-quality speech from te… ☆277 Updated last week
- ☆244 Updated 6 months ago
- ☆272 Updated 2 weeks ago
- OpenAI API and Whisper based Video Translation ☆66 Updated 5 months ago
- ⚡ Insanely fast AI voice assistant with <500ms response times ☆223 Updated 2 weeks ago
- Video-to-video translation with voice cloning and lip synchronization; supports translation between Chinese and English. ☆97 Updated 4 months ago
- Speech recognition & diarisation solution with text alignment, deployed in AML pipelines ☆81 Updated 4 months ago
- Whisper realtime streaming for long speech-to-text transcription and translation ☆97 Updated 7 months ago
- Local SRT/LLM/TTS Voicechat ☆471 Updated last month
- SenseVoice-python: An enterprise-grade, open-source, multi-language ASR system from the FunASR open-source project, with onnxruntime