BUTSpeechFIT / DiariZen
A toolkit for speaker diarization.
☆165Updated 2 months ago
Alternatives and similar repositories for DiariZen:
Users who are interested in DiariZen are comparing it to the libraries listed below:
- We Speech Transcript based on LLM, in 300 lines of code.☆142Updated this week
- A lightweight end-to-end text-to-speech model☆99Updated last month
- ☆152Updated 2 months ago
- Open source inference code for Rev's model☆364Updated last week
- Speech Diarization for scrum automation☆101Updated last year
- LlamaVoice is a llama-based large voice generation model, providing inference and training ability.☆229Updated 5 months ago
- Sample Repository for the AlibabaCloud Bailian Speech SDK☆67Updated this week
- SenseVoice-python: An enterprise-grade, open-source, multi-language ASR system from FunASR, running on ONNX Runtime☆80Updated 4 months ago
- ☆186Updated 4 months ago
- ☆139Updated 2 months ago
- Paper, Code and Resources for Speech Language Model and End2End Speech Dialogue System.☆151Updated 2 months ago
- TTSAudioNormalizer is a specialized tool for TTS data production, featuring descriptive statistical analysis of audio loudness and loud…☆92Updated last month
- InspireMusic: A Unified Framework for Music, Song, Audio Generation.☆339Updated this week
- An evolving, large-scale and multi-domain ASR corpus for low-resource languages with automated crawling, transcription and refinement☆137Updated last month
- Nendo is an open source platform for AI-driven audio management, intelligence, and generation.☆119Updated 10 months ago
- Grapheme-to-Phoneme for Mixed Chinese (Mandarin or Cantonese) and English.☆84Updated last week
- Target Speaker Extraction Toolkit☆141Updated this week
- Running the F5-TTS by ONNX Runtime☆91Updated this week
- RealSI: Open Benchmark for Simultaneous Interpretation in Real-world Scenarios☆49Updated 2 months ago
- Next-generation TTS model using flow-matching and DiT, inspired by Stable Diffusion 3☆381Updated 4 months ago
- o1-like Chain of Thoughts on claude-3-5-sonnet!☆76Updated 4 months ago
- An enterprise-grade Voice Activity Detector from ModelScope and FunASR.☆71Updated last year
- This is the audio sample repository for speech separation model "MossFormer2".☆120Updated 2 months ago
- Text to speech alignment using CTC forced alignment☆206Updated last week
- Speech recognition based on FunASR, including a standard version and an ONNX version (recommended).☆28Updated 3 months ago
- Huawei Grad-TTS for Chinese☆46Updated last year
- A curated list of speech datasets (110+ datasets, 75+ easy to download)☆120Updated last year
- ONNX Inference of Pyannote Segmentation☆81Updated last month