douglas125 / SpeechIdentity
Identity verification from speech
☆19 Updated 3 years ago
Alternatives and similar repositories for SpeechIdentity
Users who are interested in SpeechIdentity are comparing it to the libraries listed below.
- Efficient approach to speaker diarization using voice characteristics extraction ☆100 Updated 3 months ago
- Provide Gradio custom components to make the diarization-based audio labeling process easier and faster. ☆68 Updated 2 weeks ago
- Fine-tune and evaluate Whisper models for Automatic Speech Recognition (ASR) on custom datasets or datasets from huggingface. ☆338 Updated 2 years ago
- Code and Pretrained Models for Interspeech 2023 Paper "Whisper-AT: Noise-Robust Automatic Speech Recognizers are Also Strong Audio Event … ☆403 Updated last year
- Real-time processing and delivery of sentences from a continuous stream of characters or text chunks. ☆69 Updated 2 months ago
- 🔊 Create labeled datasets, enhance audio quality, identify speakers, support diverse dataset types. 🎧👥📊 Advanced audio processing. ☆252 Updated last year
- Speaker diarization model ☆28 Updated 2 years ago
- ☆308 Updated last year
- ☆169 Updated 9 months ago
- ☆377 Updated last year
- A TTS model capable of generating ultra-realistic dialogue in one pass. ☆208 Updated 5 months ago
- Speech recognition & diarisation solution with text alignment, deployed in AML pipelines ☆97 Updated last year
- Finetune VITS and MMS using HuggingFace's tools ☆164 Updated last year
- Auto-AVSR: Lip-Reading Sentences Project ☆376 Updated 8 months ago
- Simplified diarization pipeline using some pretrained models - audio file to diarized segments in a few lines of code ☆148 Updated last year
- Speaker Diarization with Transformers ☆69 Updated 3 months ago
- Live speech recognition using Facebook's wav2vec 2.0 model. ☆368 Updated last year
- Real-time Speech-Text Foundation Model Toolkit (wip) ☆246 Updated 6 months ago
- Reproducible experimental protocols for multimedia (audio, video, text) database ☆107 Updated last week
- Deep Learning - one shot learning for speaker recognition using Filter Banks ☆170 Updated last year
- Collection of Open Source Speech Data ☆160 Updated last week
- Fine Tune the Style-TTS2 Voice Model ☆252 Updated 3 months ago
- Whisper-Flamingo [Interspeech 2024] and mWhisper-Flamingo [IEEE SPL 2025] for Audio-Visual Speech Recognition and Translation ☆181 Updated last month
- ONNX Inference of Pyannote Segmentation ☆93 Updated 9 months ago
- The human speaks a language with an accent. A particular accent necessarily reflects a person's linguistic background. The model defines … ☆62 Updated 3 years ago
- SoTA open-source TTS ☆87 Updated 3 months ago
- 😎 Awesome lists about Speech Emotion Recognition ☆96 Updated 9 months ago
- ☆274 Updated last year
- Official Implementation of StyleTTS ☆448 Updated 8 months ago
- A testing repo to share code and thoughts on diarisation ☆56 Updated last year