IMUDGES / Daily_report_2018
☆7 Updated 6 years ago
Alternatives and similar repositories for Daily_report_2018:
Users who are interested in Daily_report_2018 are comparing it to the repositories listed below
- ☆13 Updated 9 months ago
- Accepted by TMM 2022 ☆16 Updated 2 years ago
- ☆131 Updated 2 years ago
- ☆24 Updated 7 months ago
- ☆161 Updated 9 months ago
- This package aims at simplifying the download of the AudioCaps dataset. ☆33 Updated last year
- cross modal background suppression for audio-visual event localization ☆35 Updated 3 years ago
- ☆33 Updated 5 months ago
- PyTorch Implementation of SimulLR ☆11 Updated 3 years ago
- Conditional Diffusion Probabilistic Model for Speech Enhancement ☆231 Updated 2 years ago
- ☆40 Updated 2 years ago
- ☆17 Updated last year
- soundnet and localize sound source ☆11 Updated 4 years ago
- Full-frequency dynamic convolution: a physical frequency-dependent convolution for sound event detection ☆16 Updated 8 months ago
- [CVPR 2023] Official code for paper: Learning to Dub Movies via Hierarchical Prosody Models. ☆105 Updated 10 months ago
- PyTorch implementation of "Distinguishing Homophenes using Multi-Head Visual-Audio Memory" (AAAI2022) ☆27 Updated last year
- Code for LAVSS: Location-Guided Audio-Visual Spatial Audio Separation ☆12 Updated 2 months ago
- ☆18 Updated last year
- Official implementation of SpeechFormer written in Python (PyTorch). ☆78 Updated 2 years ago
- ☆47 Updated 9 months ago
- Audio-Visual Corruption Modeling of our paper "Watch or Listen: Robust Audio-Visual Speech Recognition with Visual Corruption Modeling an… ☆33 Updated last year
- This is the official repo of our work titled "The Codecfake Dataset and Countermeasures for the Universally Detection of Deepfake Audio". ☆56 Updated 4 months ago
- MUSIC-AVQA, CVPR2022 (ORAL) ☆84 Updated 2 years ago
- A ResNet Speaker Recognition&Verification Demo ☆26 Updated 3 years ago
- Baseline method for audio-visual sound event localization and detection task of DCASE 2023 challenge ☆51 Updated last month
- 16k Hz Vocoder (HiFiGAN Codes and Pretrained Models) ☆18 Updated 2 years ago
- ☆17 Updated 4 years ago
- MetricGAN: Generative Adversarial Networks based Black-box Metric Scores Optimization for Speech Enhancement (ICML 2019, with Travel awar… ☆137 Updated 4 years ago
- The official repository of SpeechCraft dataset, a large-scale expressive bilingual speech dataset with natural language descriptions. ☆124 Updated 2 weeks ago
- It includes papers on speech&audio field. Now update: ICLR2023-2025, ICML2023-2024, NeurIPS2023-2024, ACMMM2024, AAAI2024, ACL2024, EMNLP… ☆49 Updated this week