CoEDL / vad-sli-asr
A pipeline to isolate and transcribe one language in mixed-language speech
☆18 · Updated last year
Related projects:
- SpeechGLUE is a speech version of the GLUE benchmark, driven by text-to-speech. ☆13 · Updated last year
- Prosodic Speech Segmentation with Transformers ☆22 · Updated 6 months ago
- Workflow for forced alignment between languages ☆17 · Updated 7 months ago
- Syllable Segmentation and Cross-Lingual Generalization in a Visually Grounded, Self-Supervised Speech Model ☆26 · Updated last year
- Repository for reproducing result in journal "Self-supervised learning for Speech Emotion Recognition" ☆9 · Updated last year
- Speaker change detection using SincNet and an LSTM/Transformer ☆39 · Updated 2 months ago
- Official implementation of paper: Frame-Wise Breath Detection with Self-Training: An Exploration of Enhancing Breath Naturalness in Text-… ☆17 · Updated last month
- Clustering-based methods for overlapping diarization ☆68 · Updated 8 months ago
- Zero-Shot Foreign Accent Conversion without a Native Reference ☆27 · Updated 4 months ago
- Qualtric or Qualtreat? Generate Qualtrics listening tests for Text-To-Speech evaluations. ☆30 · Updated 2 months ago
- Grapheme-to-phoneme (G2P) conversion is the process of generating pronunciation for words based on their written form. It has a highly es… ☆17 · Updated 3 years ago
- Code for the winning solution in the SE&R 2022 Challenge - SER track. ☆13 · Updated last year
- NISQA - Non-Intrusive Speech Quality and TTS Naturalness Assessment ☆16 · Updated 2 years ago
- Dataset of ICASSP 2021 MULTILINGUAL PHONETIC DATASET FOR LOW RESOURCE SPEECH RECOGNITION ☆34 · Updated last year
- Repository for Accent Recognition (Hackathon @SLT2022) ☆20 · Updated 4 months ago
- Rescoring methods for end-to-end Automatic Speech Recognition ☆27 · Updated 3 years ago
- wav2vec2 audio classification for prosodic boundary detection and other tasks ☆31 · Updated last year
- INTERSPEECH 23 - Refunction Whisper to recognize new tasks with adapters! ☆31 · Updated last year
- A handy dataset of noises for ASR ☆19 · Updated 5 years ago
- Official repository for the "Powerset multi-class cross entropy loss for neural speaker diarization" paper published in Interspeech 2023. ☆64 · Updated 11 months ago
- Vocoder-Free Non-Parallel Conversion of Whispered Speech With Masked Cycle-Consistent Generative Adversarial Networks ☆17 · Updated last year
- A Python-based modular toolbox for building Deep Neural Network models (using PyTorch) for statistical parametric speech synthesis ☆23 · Updated 2 years ago
- A list of papers for child ASR ☆24 · Updated 5 months ago
- Phoneme segmentation using pre-trained speech models ☆49 · Updated last year
- Source code of paper <End-to-End Language Diarization for Bilingual Code-switching Speech> ☆18 · Updated 2 years ago
- A toolkit to calculate speech audio quality. Not affiliated with the original authors ☆26 · Updated last month