hedrergudene / asr-sd-pipeline
Speech recognition & diarisation solution with text alignment, deployed in AML pipelines
★84 · Updated 6 months ago
Related projects
Alternatives and complementary repositories for asr-sd-pipeline
- Efficient approach to speaker diarization using voice characteristics extraction ★68 · Updated 6 months ago
- ASR FastAPI server using faster-whisper and Multi-Scale Auto-Tuning Spectral Clustering for diarization. ★196 · Updated 3 weeks ago
- Whisper realtime streaming for long speech-to-text transcription and translation ★103 · Updated 9 months ago
- VoiceRestore: Flow-Matching Transformers for Universal Speech Restoration ★84 · Updated last month
- Provide Gradio custom components to make the diarization-based audio labeling process easier and faster. ★45 · Updated 2 weeks ago
- ★152 · Updated last year
- Google's SoundStorm: Efficient Parallel Audio Generation ★129 · Updated last year
- Go from raw audio files to a text-audio dataset automatically with OpenAI's Whisper. ★133 · Updated last year
- streaming speech to text server using Whisper ★83 · Updated last year
- A curated list of awesome OpenAI's Whisper ★93 · Updated last year
- Real-Time Whisper Voice Recognition with vosk model feedback. ★105 · Updated last year
- ez audio transcription tool with flexible processing and post-processing options ★130 · Updated 9 months ago
- Real-time Voice Activity Detection (VAD) with example use cases such as a simple voice bot and live (realtime) transcription ★54 · Updated 5 months ago
- ★87 · Updated 6 months ago
- Speaker Diarization with Transformers ★59 · Updated 6 months ago
- On-device voice activity detection (VAD) powered by deep learning ★179 · Updated this week
- Joint speech-language model - respond directly to audio! ★30 · Updated 6 months ago
- ★176 · Updated last month
- ★256 · Updated 5 months ago
- ONNX Inference of Pyannote Segmentation ★66 · Updated 2 months ago
- Transcription with speaker diarization pipeline ★86 · Updated last year
- speechlib is a library that can do speaker diarization, transcription and speaker recognition on an audio file to create transcripts with… ★157 · Updated last month
- ★253 · Updated 8 months ago
- On-device streaming text-to-speech engine powered by deep learning ★56 · Updated 2 weeks ago
- web based editor for subtitles and transcripts ★112 · Updated 3 months ago
- Live-Transcription (STT) with Whisper PoC ★155 · Updated 5 months ago
- Faster Tortoise inference than Tortoise Fast Fork ★122 · Updated 7 months ago
- Pip installable package for StyleTTS 2 human-level text-to-speech and voice cloning ★138 · Updated 4 months ago
- Self hosted high quality voice recognition for de-googled Android using whisper. Like Siri or OK Google. ★53 · Updated 10 months ago
- Whisper combined with Silero VAD, for improved long-form transcriptions ★44 · Updated last year