BirgerMoell / tmh
☆18 Updated last year
Related projects:
- Incorporating KenLM language model with HuggingFace implementation of Wav2Vec2CTC Model using beam search decoding ☆71 Updated 2 years ago
- ☆49 Updated this week
- ☆37 Updated 3 years ago
- A mini, simple, and fast end-to-end automatic speech recognition toolkit. ☆47 Updated last year
- asr2k ☆48 Updated 3 months ago
- ☆45 Updated 3 years ago
- ☆74 Updated 2 years ago
- Official code for Wav2Seq ☆95 Updated 2 years ago
- ☆69 Updated this week
- The official repository for Audio ALBERT ☆64 Updated 2 years ago
- Explore different ways to mix speech models (wav2vec2, HuBERT) and NLP models (BART, T5, GPT) together ☆42 Updated last year
- An easy way to fine-tune Wav2Vec 2.0 for low-resource languages. ☆81 Updated last year
- EMO-SUPERB submission ☆27 Updated 2 weeks ago
- Rescoring methods for end-to-end Automatic Speech Recognition ☆27 Updated 3 years ago
- Emotion detection in audio utilising self-supervised representations trained with Contrastive Predictive Coding (CPC). ☆41 Updated 2 years ago
- Code for the Paper Speech Recognition and Multi-Speaker Diarization of Long Conversations ☆36 Updated last year
- A unified dataset of multilingual emotional human utterances ☆22 Updated 2 years ago
- 《SpeechPrompt v2: Prompt Tuning for Speech Classification Tasks》Speech processing with prompting paradigm ☆80 Updated 11 months ago
- SERAB: a multi-lingual benchmark for speech emotion recognition ☆28 Updated last year
- This repository describes our reproducible framework for assessing self-supervised representation learning from speech ☆51 Updated 2 years ago
- 56 languages, 1 model: multilingual ASR ☆23 Updated 3 years ago
- PyTorch Implementation of Daft-Exprt: Robust Prosody Transfer Across Speakers for Expressive Speech Synthesis ☆56 Updated 2 years ago
- Official implementation of INTERSPEECH 2021 paper 'Emotion Recognition from Speech Using Wav2vec 2.0 Embeddings' ☆122 Updated 2 years ago
- Phoneme segmentation using pre-trained speech models ☆49 Updated last year
- [INTERSPEECH'2022] Accurate Emotion Strength Assessment for Seen and Unseen Speech Based on Data-Driven Deep Learning ☆78 Updated last year
- [IJCAI'23] Learning to Speak from Text for Low-Resource TTS ☆64 Updated last year
- Multi-task speech classification of the accent and gender of an English speaker on Mozilla's Common Voice dataset ☆23 Updated 2 weeks ago
- ☆56 Updated last year
- Making Espnet easier to use ☆51 Updated 3 years ago
- INTERSPEECH 23 - Refunction Whisper to recognize new tasks with adapters! ☆31 Updated last year