Yuanbo2020 / Audio-Visual-VAD
☆13 · May 9, 2022 · Updated 3 years ago
Alternatives and similar repositories for Audio-Visual-VAD
Users who are interested in Audio-Visual-VAD are comparing it to the repositories listed below.
- ☆21 · Feb 15, 2022 · Updated 3 years ago
- This repository presents the FSD dataset for song deepfake detection. ☆25 · Aug 18, 2025 · Updated 5 months ago
- Solos: A Dataset for Audio-Visual Music Analysis ☆24 · Feb 17, 2023 · Updated 2 years ago
- Deep Audio-Visual Embedding network (DAVEnet) implementation in PyTorch ☆65 · Aug 31, 2018 · Updated 7 years ago
- Implementation for ECCV20 paper "Self-Supervised Learning of audio-visual objects from video" ☆115 · Nov 16, 2020 · Updated 5 years ago
- A Chinese prosody prediction model based on random forests and conditional random fields ☆28 · Jul 25, 2024 · Updated last year
- ☆67 · Sep 13, 2022 · Updated 3 years ago
- AlignNet: A Unifying Approach to Audio-Visual Alignment (WACV 2020) ☆34 · Jan 10, 2021 · Updated 5 years ago
- Standard libraries for audio processing, especially STFT and Spherical Harmonics decomposition of a soundfield. ☆10 · Nov 29, 2021 · Updated 4 years ago
- ☆37 · Jun 22, 2022 · Updated 3 years ago
- Implementing LRP (Layer-wise Relevance Propagation) for a sequence-to-sequence model with GRU layers. ☆12 · Sep 8, 2023 · Updated 2 years ago
- Notes compiled mainly from Prof. Hung-yi Lee's 2020 human language processing course materials, including code and slides ☆33 · May 25, 2021 · Updated 4 years ago
- Grapheme-to-phoneme tool for corpus conversion, where phonemes match Phoible inventories ☆19 · Apr 10, 2025 · Updated 10 months ago
- AdvSV stands as the first dataset developed specifically for evaluating Speaker Verification (SV) systems against adversarial attacks. I… ☆11 · Nov 21, 2023 · Updated 2 years ago
- ☆13 · Nov 22, 2022 · Updated 3 years ago
- ☆11 · May 4, 2020 · Updated 5 years ago
- Implementing -- Histogram of oriented gradients / Support Vector Machine / TensorFlow ☆11 · Mar 15, 2017 · Updated 8 years ago
- The official repo/implementation of the paper "Training a Singing Transcription Model Using Connectionist Temporal Classification Loss an… ☆12 · Mar 25, 2025 · Updated 10 months ago
- The code repository for "Multi-layer Rehearsal Feature Augmentation for Class-Incremental Learning" (ICML24) ☆11 · Jun 7, 2024 · Updated last year
- The demo for "Discretization and Re-synthesis: an alternative method to solve the Cocktail Party Problem". ☆12 · Oct 25, 2021 · Updated 4 years ago
- awesome-audio-visual-robustness ☆11 · Jan 27, 2024 · Updated 2 years ago
- Predict prosody labels for Chinese sentences. ☆41 · Jul 7, 2022 · Updated 3 years ago
- ☆10 · Feb 19, 2021 · Updated 4 years ago
- Language-Aligned Waypoint (LAW) Supervision for Vision-and-Language Navigation in Continuous Environments ☆11 · Nov 29, 2021 · Updated 4 years ago
- ☆11 · May 30, 2019 · Updated 6 years ago
- Dating guide ☆17 · Aug 3, 2022 · Updated 3 years ago
- A CUDA-powered audio decoding framework for FLAC. ☆11 · May 22, 2018 · Updated 7 years ago
- [KDD24-ADS] R-Eval: A Unified Toolkit for Evaluating Domain Knowledge of Retrieval Augmented Large Language Models ☆11 · Apr 9, 2024 · Updated last year
- ☆10 · Sep 6, 2020 · Updated 5 years ago
- ☆13 · Oct 25, 2024 · Updated last year
- ☆49 · Nov 24, 2022 · Updated 3 years ago
- ☆15 · May 16, 2024 · Updated last year
- Development Toolkit for the VoxCeleb Speaker Recognition Challenge 2020 ☆43 · Jul 17, 2020 · Updated 5 years ago
- A collection of Embodied AI datasets. ☆18 · Jun 10, 2025 · Updated 8 months ago
- Baseline for REVERIE-Challenge using HOP ☆10 · Jul 4, 2022 · Updated 3 years ago
- My implementation (PyTorch) for the paper SST: Single-Stream Temporal Action Proposals (http://vision.stanford.edu/pdf/buch2017cvpr.pdf). ☆10 · Dec 8, 2022 · Updated 3 years ago
- SinGlow is a part of my Singing voice synthesis system. It can extract features of sound, particularly songs and music. Then we can use … ☆11 · Oct 9, 2021 · Updated 4 years ago
- ☆10 · Feb 24, 2022 · Updated 3 years ago
- ☆11 · Nov 5, 2025 · Updated 3 months ago