IMUDGES / Daily_report_2018

☆7

Alternatives and similar repositories for Daily_report_2018:

Users who are interested in Daily_report_2018 are comparing it to the repositories listed below.

zexupan / USEV
☆13 Updated 9 months ago
Overcautious / ADENet
Accepted by TMM 2022
☆16 Updated 2 years ago
Zain-Jiang / Dict-TTS
☆131 Updated 2 years ago
wangtianrui / ProgRE
☆24 Updated 7 months ago
thuhcsi / SECap
☆161 Updated 9 months ago
MorenoLaQuatra / audiocaps-download
This package aims at simplifying the download of the AudioCaps dataset.
☆33 Updated last year
marmot-xy / CMBS
cross modal background suppression for audio-visual event localization
☆35 Updated 3 years ago
zexupan / MuSE
☆33 Updated 5 months ago
NYElegance / SimulLR
PyTorch Implementation of SimulLR
☆11 Updated 3 years ago
neillu23 / CDiffuSE
Conditional Diffusion Probabilistic Model for Speech Enhancement
☆231 Updated 2 years ago
sony / CLIPSep
☆40 Updated 2 years ago
mispchallenge / MISP-ICME-AVSR
☆17 Updated last year
liyidi / soundnet_localize_sound_source
soundnet and localize sound source
☆11 Updated 4 years ago
Harper812 / FFDConv
Full-frequency dynamic convolution: a physical frequency-dependent convolution for sound event detection
☆16 Updated 8 months ago
GalaxyCong / HPMDubbing
[CVPR 2023] Official code for paper: Learning to Dub Movies via Hierarchical Prosody Models.
☆105 Updated 10 months ago
ms-dot-k / Multi-head-Visual-Audio-Memory
PyTorch implementation of "Distinguishing Homophenes using Multi-Head Visual-Audio Memory" (AAAI2022)
☆27 Updated last year
YYX666660 / LAVSS
Code for LAVSS: Location-Guided Audio-Visual Spatial Audio Separation
☆12 Updated 2 months ago
Levent9 / Zero-shot-FaceVC
☆18 Updated last year
HappyColor / SpeechFormer
Official implementation of SpeechFormer written in Python (PyTorch).
☆78 Updated 2 years ago
nii-yamagishilab / PartialSpoof
☆47 Updated 9 months ago
joannahong / AV-RelScore
Audio-Visual Corruption Modeling of our paper "Watch or Listen: Robust Audio-Visual Speech Recognition with Visual Corruption Modeling an…
☆33 Updated last year
xieyuankun / Codecfake
This is the official repo of our work titled "The Codecfake Dataset and Countermeasures for the Universally Detection of Deepfake Audio".
☆56 Updated 4 months ago
GeWu-Lab / MUSIC-AVQA
MUSIC-AVQA, CVPR2022 (ORAL)
☆84 Updated 2 years ago
TaoRuijie / Speaker-Recognition-Demo
A ResNet Speaker Recognition & Verification Demo
☆26 Updated 3 years ago
sony / audio-visual-seld-dcase2023
Baseline method for audio-visual sound event localization and detection task of DCASE 2023 challenge
☆51 Updated last month
GalaxyCong / HPMDubbing_Vocoder
16 kHz Vocoder (HiFiGAN Codes and Pretrained Models)
☆18 Updated 2 years ago
lessonxmk / head_fusion
☆17 Updated 4 years ago
JasonSWFu / MetricGAN
MetricGAN: Generative Adversarial Networks based Black-box Metric Scores Optimization for Speech Enhancement (ICML 2019, with Travel awar…
☆137 Updated 4 years ago
thuhcsi / SpeechCraft
The official repository of SpeechCraft dataset, a large-scale expressive bilingual speech dataset with natural language descriptions.
☆124 Updated 2 weeks ago
01Zhangbw / Speech-and-audio-papers-Top-Conference
Includes papers in the speech and audio field. Currently covers: ICLR2023-2025, ICML2023-2024, NeurIPS2023-2024, ACMMM2024, AAAI2024, ACL2024, EMNLP…
☆49 Updated this week