Coder-jzq / ICASSP2025-IIICSS
☆11 Updated 10 months ago
Alternatives and similar repositories for ICASSP2025-IIICSS
Users that are interested in ICASSP2025-IIICSS are comparing it to the libraries listed below
- ☆16 Updated 10 months ago
- [CVPR 2024] AV2AV: Direct Audio-Visual Speech to Audio-Visual Speech Translation with Unified Audio-Visual Speech Representation ☆44 Updated last year
- ☆19 Updated last year
- ☆130 Updated 2 weeks ago
- [ACL 2024] This is the Pytorch code for our paper "StyleDubber: Towards Multi-Scale Style Learning for Movie Dubbing" ☆97 Updated last year
- This repository collects papers related to Speech Tokenizer. ☆17 Updated last year
- ☆54 Updated last year
- We propose C2SER, a novel audio-language model designed to enhance the stability and accuracy of speech emotion recognition through conte… ☆41 Updated 11 months ago
- ☆28 Updated last year
- A 6-million Audio-Caption Paired Dataset Built with a LLMs and ALMs-based Automatic Pipeline ☆194 Updated last year
- Towards a general language-audio model for computational paralinguistic tasks ☆23 Updated last year
- ☆27 Updated 2 years ago
- A fully and partially fake speech dataset for evaluation ☆13 Updated 2 months ago
- The official repository of SpeechCraft dataset, a large-scale expressive bilingual speech dataset with natural language descriptions. ☆179 Updated 9 months ago
- [ICASSP 2024] Emotion Neural Transducer for Fine-Grained Speech Emotion Recognition ☆27 Updated last year
- official implementation of MGA-CLAP (ACM MM 2024) ☆28 Updated last year
- WildVSR ☆21 Updated 2 years ago
- [ICLR 2025] Enhancing Self-Supervised Models with Audio Mixtures for Polyphonic Soundscapes ☆57 Updated 4 months ago
- The official repo for Both Ears Wide Open: Towards Language-Driven Spatial Audio Generation ☆58 Updated 7 months ago
- Visually-Aware Audio Captioning ☆43 Updated 2 years ago
- ☆13 Updated 2 years ago
- This repository is the official implementation of our paper "Improving Generalization for AI-Synthesized Voice Detection", which has been… ☆22 Updated 3 weeks ago
- Official Implementation of LauraTSE: Target Speaker Extraction using Auto-Regressive Decoder-Only Language Models. ☆32 Updated 3 months ago
- [ACM MM24] Official implementation of paper "From Speaker to Dubber: Movie Dubbing with Prosody and Duration Consistency Learning" ☆33 Updated 9 months ago
- ☆14 Updated last year
- This is the official repo of our work titled "The Codecfake Dataset and Countermeasures for the Universally Detection of Deepfake Audio". ☆66 Updated last year
- ☆59 Updated last year
- TraceableSpeech: Towards Proactively Traceable Text-to-Speech with Watermarking ☆21 Updated 9 months ago
- Implementation of Frieren: Efficient Video-to-Audio Generation Network with Rectified Flow Matching (NeurIPS'24) ☆59 Updated 10 months ago
- Official Implementation of TSELM: Target speaker extraction using discrete tokens and language models ☆55 Updated 9 months ago