Coder-jzq / ICASSP2025-IIICSS
☆10 Updated 3 months ago
Alternatives and similar repositories for ICASSP2025-IIICSS
Users that are interested in ICASSP2025-IIICSS are comparing it to the libraries listed below
- ☆13 Updated 3 months ago
- ☆18 Updated last year
- Towards a general language-audio model for computational paralinguistic tasks ☆13 Updated 7 months ago
- [CVPR 2024] AV2AV: Direct Audio-Visual Speech to Audio-Visual Speech Translation with Unified Audio-Visual Speech Representation ☆37 Updated 10 months ago
- ☆81 Updated last month
- Code for Audio-Visual Target Speaker Extraction with Selective Auditory Attention (TASLP) ☆24 Updated 4 months ago
- We propose C2SER, a novel audio-language model designed to enhance the stability and accuracy of speech emotion recognition through conte… ☆36 Updated 4 months ago
- This repository collects papers related to Speech Tokenizer. ☆17 Updated 9 months ago
- ☆27 Updated last year
- A Compact and Effective Pretrained Model for Speech Emotion Recognition ☆43 Updated last year
- A fully and partially fake speech dataset for evaluation ☆12 Updated this week
- ☆43 Updated last year
- ☆13 Updated last year
- Official implementation of the INTERSPEECH 2024 paper: Temporal-Channel Modeling in Multi-head Self-Attention for Synthetic Speech Detect… ☆42 Updated 7 months ago
- [ACL 2024] This is the Pytorch code for our paper "StyleDubber: Towards Multi-Scale Style Learning for Movie Dubbing" ☆88 Updated 8 months ago
- Multi-Stage Face-Voice Association Learning with Keynote Speaker Diarization (ACM MM 2024) ☆20 Updated 11 months ago
- [ICASSP 2024] Emotion Neural Transducer for Fine-Grained Speech Emotion Recognition ☆24 Updated last year
- [Interspeech 2023] Intelligible Lip-to-Speech Synthesis with Speech Units ☆40 Updated 8 months ago
- A 6-million Audio-Caption Paired Dataset Built with a LLMs and ALMs-based Automatic Pipeline ☆163 Updated 7 months ago
- ☆37 Updated 3 months ago
- An attention-based backend allowing efficient fine-tuning of transformer models for speaker verification ☆20 Updated 9 months ago
- Official Implementation and Dataset of paper - DFADD: The Diffusion and Flow-matching based Audio Deepfake Dataset ☆14 Updated 3 months ago
- A python implementation of “Self-Supervised Learning of Spatial Acoustic Representation with Cross-Channel Signal Reconstruction and Mult… ☆37 Updated 9 months ago
- This is official repository of new SOTA diffusion models based method for speech enhancement ☆42 Updated 11 months ago
- A repo containing download guidance and corresponding scripts of the VoxBlink dataset. ☆28 Updated last year
- ☆55 Updated 10 months ago
- Emotion Rendering for Conversational Speech Synthesis with Heterogeneous Graph-Based Context Modeling (Accepted by AAAI'2024) ☆55 Updated last year
- The official repo for Both Ears Wide Open: Towards Language-Driven Spatial Audio Generation ☆41 Updated 2 weeks ago
- Pytorch implementation of Diff-SV: A Unified Hierarchical Framework for Noise-Robust Speaker Verification Using Score-Based Diffusion Pro… ☆23 Updated last year
- ☆24 Updated 9 months ago