BUTSpeechFIT / DiariZen
A toolkit for speaker diarization.
☆140Updated 3 weeks ago
Related projects
Alternatives and complementary repositories for DiariZen
- We Speech Transcript based on LLM, in 300 lines of code.☆126Updated 2 months ago
- A lightweight end-to-end text-to-speech model☆90Updated last month
- Open source inference code for Rev's model☆331Updated 2 weeks ago
- Verbatim Automatic Speech Recognition with improved word-level timestamps and filler detection☆252Updated 2 months ago
- Ultimate Vocal Remover 5 with Gradio UI. Separate an audio file into various stems, using multiple models☆203Updated this week
- ☆165Updated last month
- LlamaVoice is a llama-based large voice generation model, providing inference and training ability.☆219Updated 2 months ago
- Speech Diarization for scrum automation☆97Updated last year
- SenseVoice-python: An enterprise-grade open-source multi-language ASR system from FunASR, with onnxruntime☆73Updated last month
- Nendo is an open source platform for AI-driven audio management, intelligence, and generation.☆117Updated 7 months ago
- RealSI: Open Benchmark for Simultaneous Interpretation in Real-world Scenarios☆44Updated 3 months ago
- Fuse ChatTTS with OpenVoice, upload a 10-second audio clip, and clone your personalized ChatTTS voice.☆359Updated this week
- o1-like Chain of Thoughts on claude-3-5-sonnet!☆68Updated last month
- Pseudo Streaming SenseVoice with Hotwords☆75Updated last week
- Have a natural voice conversation with an LLM☆222Updated this week
- Gradio-powered application that converts audio recordings of meetings into transcripts and provides concise summaries using whisper.☆63Updated last month
- Interface for OuteTTS models.☆317Updated this week
- Text to speech alignment using CTC forced alignment☆130Updated last week
- Port of FunASR's SenseVoice model in C/C++☆158Updated 2 weeks ago
- This project provides a RESTful API for converting text to speech using Microsoft's Azure Cognitive Services☆91Updated 5 months ago
- MaskGCT demo page☆12Updated 2 weeks ago
- Gradio WebUI for whisper, faster-whisper, whisper-timestamped. Supports YouTube Downloader, Vocal Remover, Transcription, Text-to-Speech,…☆374Updated this week
- Speech Enhancement Tools Based on Deep Learning☆116Updated last year
- Next-generation TTS model using flow-matching and DiT, inspired by Stable Diffusion 3☆362Updated last month
- An Open-Sourced LLM-empowered Foundation TTS System☆424Updated 3 weeks ago
- Grapheme-to-Phoneme for Mixed Chinese (Mandarin or Cantonese) and English☆71Updated this week
- AMT-APC: Automatic Piano Cover by Fine-Tuning an Automatic Music Transcription Model☆53Updated last week
- Engaging in conversation with ChatGPT using voice.☆26Updated 9 months ago
- ChatTTS HTTP API☆48Updated 5 months ago