igormq / speech2text
☆11Updated 3 years ago
Related projects:
- Grapheme-to-phoneme (G2P) conversion is the process of generating pronunciation for words based on their written form. It has a highly es…☆17Updated 3 years ago
- S3PRL for Speech Emotion Recognition (see s3prl > downstream)☆13Updated 3 months ago
- ☆10Updated last year
- Baseline Kaldi script for UA-SPEECH corpus☆29Updated 3 years ago
- Unsupervised Voice Activity Detection by Modeling Source and System Information using Zero Frequency Filtering☆18Updated 11 months ago
- Speechflow for emotion recognition related information decomposition☆9Updated 3 years ago
- Implementation of the paper "BERTphone: Phonetically-aware Encoder Representations for Utterance-level Speaker and Language Recognition"☆17Updated 3 years ago
- Scripts for data generation, scoring and data manifest preparation for CHiME-8 DASR task.☆19Updated this week
- SERAB: a multi-lingual benchmark for speech emotion recognition☆28Updated last year
- ☆17Updated 6 months ago
- PyPI-installable TDNN and TDNN-F layers for PyTorch-based acoustic model training☆38Updated 3 years ago
- A collection of papers related to speech model compression☆24Updated last year
- Steps to perform text-based speaker diarization with the Kaldi toolkit☆11Updated 5 years ago
- Python wrappers for Kaldi's Levenshtein distance and alignment code.☆60Updated 6 months ago
- A simple command line tool to calculate WER for ASR.☆13Updated last year
- ☆11Updated 2 years ago
- Code for the paper: "Leveraging speaker attribute information using multi task learning for speaker verification and diarization" present…☆24Updated last year
- Convert WSJ sphere format to waveform and do data simulation.☆16Updated 4 years ago
- Code for the winning solution in the SE&R 2022 Challenge - SER track.☆13Updated last year
- Implementation of the paper "Confidence estimation for attention based sequence to sequence models for speech recognition"☆15Updated 3 years ago
- End-to-end diarization loss☆19Updated 3 years ago
- A list of papers for child ASR☆24Updated 5 months ago
- FEERCI: a package for fast non-parametric confidence intervals for equal error rates☆12Updated 6 months ago
- A deep neural network for finding text-independent speaker embeddings, written in TensorFlow and Tensorpack☆10Updated 6 years ago
- Code for the paper "Speech Recognition and Multi-Speaker Diarization of Long Conversations"☆36Updated last year
- Aty-TTS: Improving fairness for spoken language understanding in atypical speech with Text-to-Speech☆10Updated 9 months ago
- Lattice combination algorithm to combine inaccurate transcripts with hypothesis lattices☆16Updated 6 months ago
- Python toolkit for speech processing☆64Updated 3 weeks ago
- A set of audio augmentation techniques to perform noise insertion in datasets used for Automatic Speech Recognition.☆27Updated 2 years ago
- Incorporating a KenLM language model with the HuggingFace implementation of the Wav2Vec2CTC model using beam search decoding☆71Updated 2 years ago