Official implementation of the paper "SPEAKER VGG CCT: Cross-corpus Speech Emotion Recognition with Speaker Embedding and Vision Transformers, 2022"
☆24 · Feb 17, 2023 · Updated 3 years ago
Alternatives and similar repositories for Speaker-VGG-CCT
Users that are interested in Speaker-VGG-CCT are comparing it to the libraries listed below
- [ICASSP 2023] Official TensorFlow implementation of "Temporal Modeling Matters: A Novel Temporal Emotional Modeling Approach for Speech E… ☆188 · May 15, 2024 · Updated last year
- Code for Speech Emotion Recognition with Co-Attention based Multi-level Acoustic Information ☆164 · Nov 27, 2023 · Updated 2 years ago
- SpeechFormer++ in PyTorch ☆50 · Jul 21, 2023 · Updated 2 years ago
- SChunk-Encoder (Transformer or Conformer) for streaming E2E ASR ☆11 · Oct 21, 2022 · Updated 3 years ago
- FRAME-LEVEL EMOTIONAL STATE ALIGNMENT METHOD FOR SPEECH EMOTION RECOGNITION ☆23 · Dec 22, 2024 · Updated last year
- Unsupervised speech activity detection system. ☆11 · Jul 2, 2018 · Updated 7 years ago
- KABooks is a tool to automate the process of creating datasets for training Text-To-Speech (TTS) and Speech-To-Text (STT) models. Using a… ☆12 · Mar 24, 2023 · Updated 2 years ago
- Run Retrieval-based Voice Conversion training and inference with ease. ☆11 · Jan 24, 2025 · Updated last year
- ATTENTION AGGREGATION NETWORK FOR AUDIO-VISUAL EMOTION RECOGNITION ☆13 · Sep 25, 2023 · Updated 2 years ago
- Transformer-based model for Speech Emotion Recognition (SER), implemented in PyTorch ☆42 · Apr 12, 2024 · Updated last year
- I created some notebooks about different concepts of financial engineering ☆10 · Sep 28, 2025 · Updated 5 months ago
- Repository for reproducing the results in the journal article "Self-supervised learning for Speech Emotion Recognition" ☆10 · Mar 15, 2023 · Updated 2 years ago
- Implementation of different noise embeddings for noise aware training of Kaldi acoustic models. ☆13 · Feb 13, 2021 · Updated 5 years ago
- This is an extension of the Kaldi speech recognition software which allows decoding of speech with hybrid word and phoneme graphs.… ☆11 · Feb 4, 2020 · Updated 6 years ago
- A Benchmark Corpus for Low-Resource Cantonese Punctuation Restoration from Speech Transcripts ☆16 · Dec 3, 2024 · Updated last year
- SANE-TTS: Stable And Natural End-to-End Multilingual Text-to-Speech ☆11 · Jun 30, 2023 · Updated 2 years ago
- [ICASSP 2024] Emotion Neural Transducer for Fine-Grained Speech Emotion Recognition ☆27 · Apr 11, 2024 · Updated last year
- A Compact and Effective Pretrained Model for Speech Emotion Recognition ☆53 · Jun 29, 2024 · Updated last year
- Repo of the paper "Towards Building an End-to-End Multilingual Automatic Lyrics Transcription Model" ☆15 · Jun 28, 2024 · Updated last year
- Aty-TTS: Improving fairness for spoken language understanding in atypical speech with Text-to-Speech ☆11 · May 14, 2025 · Updated 9 months ago
- WebRTC-based real-time audio streaming with Faster Whisper ASR integration for live speech-to-text transcription. ☆13 · Sep 27, 2024 · Updated last year
- Indic-Conformer models for ASR ☆21 · Jul 19, 2024 · Updated last year
- PolEval 2021 Task 1 ☆15 · Jun 28, 2022 · Updated 3 years ago
- An original package of the dynamic compressive gammachirp filterbank (dcGC-FB) ☆14 · Oct 27, 2024 · Updated last year
- Comprehensive quantitative comparison of lossless and lossy audio codecs ☆39 · Feb 11, 2023 · Updated 3 years ago
- DWFormer: Dynamic Window Transformer for Speech Emotion Recognition (ICASSP 2023 Oral) ☆69 · Jul 8, 2024 · Updated last year
- Crawling and creating a German language model resource ☆18 · Aug 23, 2022 · Updated 3 years ago
- Vocoder-Free Non-Parallel Conversion of Whispered Speech With Masked Cycle-Consistent Generative Adversarial Networks ☆17 · Aug 18, 2023 · Updated 2 years ago
- SERVER: Multi-modal Speech Emotion Recognition using Transformer-based and Vision-based Embeddings ☆15 · Jan 23, 2024 · Updated 2 years ago
- Example workflow for our data-centric speech benchmark ☆17 · Jul 6, 2023 · Updated 2 years ago
- Goodness of Pronunciation algorithm using PyKaldi ☆18 · Jun 12, 2022 · Updated 3 years ago
- A composition of offline tools to achieve high quality multilingual speech to text transcription ☆23 · Feb 2, 2026 · Updated last month