PranavPutsa1006 / Speaker-Diarization
Identifying individual speakers in an audio stream based on the unique characteristics found in individual voices using Python
☆16 · Updated last year
Related projects
Alternatives and complementary repositories for Speaker-Diarization
- Audio processing using deep neural networks. Speaker identification using voice embeddings. ☆13 · Updated last year
- Speaker diarization service ☆19 · Updated this week
- 🎹 pyannote + 🗒 notebook = pyannotebook ☆25 · Updated last year
- Official repository for the "Powerset multi-class cross entropy loss for neural speaker diarization" paper published in Interspeech 2023. ☆71 · Updated last year
- Keras(Tensorflow) implementations of Automatic Speech Recognition ☆22 · Updated 2 years ago
- Prabhupadavani: A Code-mixed Speech Translation Data for 25 languages ☆13 · Updated 2 years ago
- OCTRA is a web-application for the orthographic transcription of audio files. ☆35 · Updated this week
- Zero-Shot Foreign Accent Conversion without a Native Reference ☆28 · Updated 6 months ago
- Similarity Learning applied to Speaker Verification and Semantic Textual Similarity ☆12 · Updated 4 years ago
- Running Mozilla's implementation of Baidu DeepSpeech on Google Colaboratory ☆16 · Updated 5 years ago
- Speaker change detection using SincNet and an LSTM/Transformer ☆44 · Updated 4 months ago
- SpeechGLUE is a speech version of the GLUE benchmark, driven by text-to-speech. ☆13 · Updated last year
- Zero-shot Audio Classification using Whisper ☆74 · Updated last year
- Speakerbox: Fine-tune Audio Transformers for speaker identification. ☆52 · Updated 8 months ago
- A minimalistic automatic speech recognition streamlit based webapp powered by OpenAI's Whisper "State of the Art" models ☆65 · Updated 2 years ago
- OpenAI Whisper Prompt Examples ☆48 · Updated last year
- Uses machine learning to denoise audio containing speech ☆29 · Updated 4 months ago
- Compute useful transcription metrics (CER, WER, SER, ...) ☆26 · Updated 10 years ago
- This repository contains the implementation of the paper: "Span Classification with Structured Information for Disfluency Detection in Sp… ☆12 · Updated last year
- Audio tokenization, in the fastest way possible! ☆45 · Updated 2 months ago
- Reproduction of the paper SFSRNet: Super-resolution for single-channel Audio Source Separation by me (@arda-num) and @dritx16. Navigate P… ☆11 · Updated 2 years ago
- Feature extractor for DL speech processing. ☆65 · Updated 2 years ago
- Welcome to the Real-Time Voice Activity Detection (VAD) program, powered by Silero-VAD model! 🚀 This program allows you to perform live … ☆11 · Updated last year
- Reproducible experimental protocols for multimedia (audio, video, text) database ☆84 · Updated last month
- Deep Learning model for lexical stress detection in spoken English ☆26 · Updated 4 years ago
- Repository for fine-tuning Transformers 🤗 based seq2seq speech models in JAX/Flax. ☆34 · Updated last year
- Generative voice cloning model using TTS synthesis with state-of-the-art Zero-Shot Multi-Speaker functionality. A web API built with the… ☆46 · Updated last year
- This app is intended to automatically create a corpus for ASR systems using pseudo-labeling. ☆27 · Updated 9 months ago
- Parallelized automatic corpus collection for ASR. Forked from https://github.com/EgorLakomkin/KTSpeechCrawler ☆23 · Updated 3 years ago