jingzhunxue / TargetDiarization
All-in-one multi-speaker separation, identification, and diarization. It can isolate the target speaker from conversational audio and run ASR.
☆61 Updated 4 months ago
Alternatives and similar repositories for TargetDiarization
Users who are interested in TargetDiarization are comparing it to the libraries listed below.
- [APSIPA'22] Exploring Speaker Age Estimation on Different Self-Supervised Learning Models ☆14 Updated 3 years ago
- LLaSE: Maximizing Acoustic Preservation for LLaMA based Speech Enhancement ☆16 Updated 7 months ago
- Automatic speech annotator processing speech with voice activity detection, overlapping speech detection, speaker diarization and automat… ☆33 Updated last year
- A simple command line tool to calculate WER for ASR. ☆14 Updated last year
- A simple implementation for improving CosyVoice2 with the GRPO method ☆32 Updated 3 months ago
- Official code for the paper "Speaking Clearly: A Simplified Whisper-Based Codec for Low-Bitrate Speech Coding" ☆28 Updated 2 weeks ago
- Semantic tokenizer for speech and music ☆21 Updated 7 months ago
- Efficient Personalized Speech Enhancement through Self-Supervised Learning ☆23 Updated 2 years ago
- Official code of SenSE. ☆72 Updated 3 months ago
- A neural speech codec based on discrete WavLM representations ☆24 Updated last year
- Whisper Speech Quality Assessment (WhiSQA) ☆16 Updated 3 months ago
- faster inference ☆28 Updated last year
- DUSTED: Spoken-Term Discovery using Discrete Speech Units ☆18 Updated last year
- Speech-To-Text forced-alignment Speech processing Universal PERformance Benchmark ☆35 Updated 9 months ago
- Extract phoneme-level timestamps from speech audio. ☆116 Updated this week
- (WIP) Long-form speech generation ☆31 Updated 10 months ago
- ☆11 Updated 2 years ago
- ☆19 Updated last year
- An evaluation set for large-scale trained TTS models (Coming in Sep 2024) ☆12 Updated last year
- ☆21 Updated last year
- ☆36 Updated 5 months ago
- PyTorch implementation of Miipher-2 [2025], a speech restoration model by Google DeepMind ☆64 Updated 4 months ago
- Once more Diarization: Improving meeting transcription systems through segment-level speaker reassignment ☆12 Updated last year
- PyTorch Implementation of [WMCodec: End-to-End Neural Speech Codec with Deep Watermarking for Authenticity Verification](https://arxiv.or… ☆16 Updated 6 months ago
- ☆11 Updated 2 years ago
- A toolkit dedicated to speech evaluation. ☆24 Updated last year
- ☆14 Updated 3 years ago
- FlexiCodec: A Dynamic Neural Audio Codec for Low Frame Rates ☆40 Updated 3 months ago
- ☆13 Updated 4 months ago
- A toolkit for researchers in multimodal sound separation. ☆16 Updated 2 years ago