PranavPutsa1006 / Speaker-Diarization
Identifying individual speakers in an audio stream based on the unique characteristics found in individual voices using Python
☆16 Updated last year
Related projects:
- Audio processing using deep neural networks. Speaker identification using voice embeddings. ☆12 Updated last year
- pyannote + notebook = pyannotebook ☆25 Updated last year
- This repository contains the implementation of the paper: "Span Classification with Structured Information for Disfluency Detection in Sp… ☆12 Updated last year
- Speaker change detection using SincNet and an LSTM/Transformer ☆39 Updated 2 months ago
- Official repository for the "Powerset multi-class cross entropy loss for neural speaker diarization" paper published in Interspeech 2023. ☆64 Updated 11 months ago
- speech recognition using Kaldi framework ☆12 Updated 4 years ago
- Zero-Shot Foreign Accent Conversion without a Native Reference ☆27 Updated 4 months ago
- Similarity Learning applied to Speaker Verification and Semantic Textual Similarity ☆12 Updated 4 years ago
- SpeechGLUE is a speech version of the GLUE benchmark, driven by text-to-speech. ☆13 Updated last year
- Final training script from HuggingFace Whisper Fine tuning event - to get best results on finetuned model. ☆12 Updated last year
- Audio tokenization, in the fastest way possible! ☆45 Updated 3 weeks ago
- Speech Recognition Challenge by Speech Lab - IIT Madras ☆11 Updated 3 years ago
- A minimalistic automatic speech recognition streamlit based webapp powered by OpenAI's Whisper "State of the Art" models ☆65 Updated last year
- Generate audio datasets for training Text-To-Speech models, through smart audio splitting with silence detection, and transcription using… ☆27 Updated last year
- Prabhupadavani: A Code-mixed Speech Translation Data for 25 languages ☆13 Updated last year
- Reproducible experimental protocols for multimedia (audio, video, text) database ☆79 Updated 5 months ago
- Keras(Tensorflow) implementations of Automatic Speech Recognition ☆22 Updated 2 years ago
- This app is intended to automatically create a corpus for ASR systems using pseudo-labeling. ☆27 Updated 7 months ago
- ☆56 Updated last year
- OCTRA is a web-application for the orthographic transcription of audio files. ☆35 Updated last week
- Zero-shot Audio Classification using Whisper ☆74 Updated last year
- Speakerbox: Fine-tune Audio Transformers for speaker identification. ☆51 Updated 6 months ago
- Phoneme prediction from speech mel-spectrograms using RNN. ☆13 Updated 5 years ago
- The largest clinical study in the world to collect voice data labeled with health information (N>6,000 participants, 48 utterances… ☆28 Updated 3 years ago
- Parallelized automatic corpus collection for ASR. Forked from https://github.com/EgorLakomkin/KTSpeechCrawler ☆24 Updated 3 years ago
- Compute useful transcriptions metrics (CER, WER, SER, ...) ☆26 Updated 9 years ago
- Machine learning experiment to perform gender classification from raw audio. ☆23 Updated 6 years ago
- Text to speech is an emerging zone of AI. This repository helps to create a dataset with audio and transcripts for personalized text to s… ☆27 Updated last year
- ☆23 Updated last year
- An open source NLP as a service project focused on providing state of the art systems with ease. Training and inference by simple docker … ☆20 Updated this week