douglas125 / SpeechIdentity
Identity verification from speech
☆19Updated 2 years ago
Alternatives and similar repositories for SpeechIdentity
Users who are interested in SpeechIdentity are comparing it to the libraries listed below
- Provide Gradio custom components to make the diarization-based audio labeling process easier and faster.☆62Updated last week
- Speaker diarization model☆27Updated 2 years ago
- Efficient approach to speaker diarization using voice characteristics extraction☆94Updated last year
- ☆294Updated 11 months ago
- Google's SoundStorm: Efficient Parallel Audio Generation☆132Updated last year
- Reproducible experimental protocols for multimedia (audio, video, text) database☆100Updated 3 months ago
- Real-time Speech-Text Foundation Model Toolkit (wip)☆230Updated 2 months ago
- VoiceRestore: Flow-Matching Transformers for Universal Speech Restoration☆170Updated last month
- Simplified diarization pipeline using some pretrained models - audio file to diarized segments in a few lines of code☆149Updated last year
- [INTERSPEECH 2024] EmoBox: Multilingual Multi-corpus Speech Emotion Recognition Toolkit and Benchmark☆249Updated 2 months ago
- The official Pytorch implementation of "Frame-wise streaming end-to-end speaker diarization with non-autoregressive self-attention-based …☆136Updated 3 months ago
- Companion repo for the paper "PixIT: Joint Training of Speaker Diarization and Speech Separation from Real-world Multi-speaker Recordings…☆90Updated 4 months ago
- A TTS model capable of generating ultra-realistic dialogue in one pass.☆168Updated last month
- On-device speaker recognition engine powered by deep learning☆35Updated this week
- This is the audio sample repository for speech separation model "MossFormer2".☆129Updated 6 months ago
- Official repository for the "Powerset multi-class cross entropy loss for neural speaker diarization" paper published in Interspeech 2023.☆83Updated last year
- On-device speaker diarization powered by deep learning☆46Updated 3 weeks ago
- VoiceBench: Benchmarking LLM-Based Voice Assistants☆217Updated this week
- Deep Learning - one shot learning for speaker recognition using Filter Banks☆168Updated 11 months ago
- Spot the conversation: speaker diarisation in the wild☆140Updated 2 years ago
- Create an LJSpeech structured voice dataset on wave input☆30Updated 8 months ago
- Whisper-Flamingo [Interspeech 2024] and mWhisper-Flamingo [IEEE SPL 2025] for Audio-Visual Speech Recognition and Translation☆166Updated last month
- On-device voice activity detection (VAD) powered by deep learning☆217Updated this week
- A high-throughput and memory-efficient inference and serving engine for Whisper, https://mesolitica.com/blog/vllm-whisper☆27Updated 10 months ago
- Speaker change detection using SincNet and an LSTM/Transformer☆51Updated last week
- This project is about performing Speaker diarization for Hindi Language.☆50Updated 4 years ago
- ☆257Updated last year
- Voice Activity Projection Models: Self-supervised learning of Turn-taking Events☆66Updated last year
- The official PyTorch implementation of the Interspeech 2024 paper "Reshape Dimensions Network for Speaker Recognition"☆163Updated 6 months ago
- Go from raw audio files to a text-audio dataset automatically with OpenAI's Whisper.☆137Updated last year