liutaocode / DiarizationVisualization
Visualization tools for audio-only and multi-modal speaker diarization datasets
☆13 · Updated 2 years ago
Alternatives and similar repositories for DiarizationVisualization
Users who are interested in DiarizationVisualization are comparing it to the repositories listed below.
- Companion repo for the paper "PixIT: Joint Training of Speaker Diarization and Speech Separation from Real-world Multi-speaker Recordings… ☆99 · Updated 11 months ago
- The official PyTorch implementation of "Frame-wise streaming end-to-end speaker diarization with non-autoregressive self-attention-based … ☆159 · Updated last week
- Official repository for VoxBlink2 ☆85 · Updated last year
- FACodec: Speech Codec with Attribute Factorization used for NaturalSpeech 3 ☆226 · Updated last year
- [INTERSPEECH 2022] This dataset is designed for multi-modal speaker diarization and lip-speech synchronization in the wild. ☆58 · Updated last year
- ZMM-TTS: Zero-shot Multilingual and Multispeaker Speech Synthesis Conditioned on Self-supervised Discrete Speech Representations ☆181 · Updated last year
- This is the audio sample repository for the speech separation model "MossFormer2". ☆157 · Updated last year
- Joint CTC-S2S Phoneme-level ASR for Voice Conversion and TTS (Text-Mel Alignment) ☆124 · Updated 3 years ago
- Uses a jointly trained speaker encoder with a consistency loss to achieve cross-lingual voice conversion and expressive voice conversion ☆152 · Updated 2 years ago
- Official PyTorch implementation of "Diff-HierVC: Diffusion-based Hierarchical Voice Conversion with Robust Pitch Generation and Masked Pr… ☆231 · Updated last year
- Official repository of SepReformer for speech separation ☆233 · Updated 11 months ago
- Target Speaker Extraction Toolkit ☆231 · Updated 2 months ago
- The official PyTorch implementation of the Interspeech 2024 paper "Reshape Dimensions Network for Speaker Recognition" ☆182 · Updated 2 months ago
- Explicit Estimation of Magnitude and Phase Spectra in Parallel for High-Quality Speech Enhancement ☆454 · Updated 7 months ago
- Easy-to-Use Speech MOS predictors ☆336 · Updated 2 years ago
- ☆140 · Updated last year
- Phoneme-Level BERT for Enhanced Prosody of Text-to-Speech with Grapheme Predictions ☆263 · Updated 11 months ago
- An implementation for training the HiFi-GAN part of the XTTSv2 model using Coqui/TTS. ☆86 · Updated last year
- This repo provides the processed samples of the manuscript "MossFormer: Pushing the Performance Limit of Monaural Speech Separation using… ☆99 · Updated last year
- S3PRL-VC: A Voice Conversion Toolkit based on S3PRL ☆101 · Updated last year
- An open source implementation of Microsoft's VALL-E X zero-shot TTS model. A demo is available at https://plachtaa.github.io ☆69 · Updated 2 years ago
- Unofficial implementation of NaturalSpeech2 for Voice Conversion and Text to Speech ☆236 · Updated last year
- Training code for FACodec presented in NaturalSpeech3 ☆231 · Updated last year
- NOTSOFAR-1 Challenge: Distant Diarization and ASR ☆58 · Updated 10 months ago
- TriAAN-VC: Triple Adaptive Attention Normalization for Any-to-Any Voice Conversion ☆148 · Updated last year
- Voice gender classifier using ECAPA-TDNN ☆62 · Updated 10 months ago
- Analysis of XLS-R for Speech Quality Assessment ☆14 · Updated 10 months ago
- [INTERSPEECH 2024] The official implementation of EmoSphere-TTS: Emotional Style and Intensity Modeling via Spherical Emotion Vector for … ☆167 · Updated 7 months ago
- ☆69 · Updated last year
- Simplified diarization pipeline using some pretrained models: audio file to diarized segments in a few lines of code ☆153 · Updated last year