Yifei-ZHAO96 / STAM-pytorch
Pytorch implementation of "spectro-temporal attention-based voice activity detection"
☆13 · Jun 4, 2024 · Updated last year
Alternatives and similar repositories for STAM-pytorch
Users that are interested in STAM-pytorch are comparing it to the libraries listed below
- Pytorch version of Voice Activity Detection (VAD) based on Deep Learning (https://github.com/filippogiruzzi) ☆27 · Mar 20, 2021 · Updated 4 years ago
- A LSTM for voice activity detection. In fact, this is a homework which I didn't expected. ☆13 · Dec 3, 2020 · Updated 5 years ago
- Tr-VAD: An Efficient Transformer based Voice Activity Detection Model ☆17 · Aug 1, 2024 · Updated last year
- Repo for our pooling approach on the DCASE2018 task4 ☆15 · Jul 6, 2023 · Updated 2 years ago
- Pytorch implementation of SELF-ATTENTIVE VAD, ICASSP 2021 ☆160 · Oct 26, 2021 · Updated 4 years ago
- Lightweight CNN for Robust Voice Activity Detection ☆20 · Jun 30, 2023 · Updated 2 years ago
- Voice Activity Detection ☆29 · Nov 13, 2017 · Updated 8 years ago
- Matlab and Python libraries for an unsupervised method for robust voice activity detection (rVAD), as in the paper rVAD: An Unsupervised … ☆138 · Jan 20, 2024 · Updated 2 years ago
- Dataset created for the Power Line Insulators Inspection Detections ☆10 · Jul 2, 2020 · Updated 5 years ago
- Photorealism model use RealVisXL v4.0 ☆12 · Feb 20, 2024 · Updated last year
- Running a tflite facial expression recognition model on Android ☆12 · Apr 7, 2021 · Updated 4 years ago
- This is a repository for a paper accepted at the 2022 IEEE Spoken Language Technology Workshop (SLT 2022) ☆41 · Jul 10, 2024 · Updated last year
- The codebase for Data-driven general-purpose voice activity detection. ☆93 · Aug 3, 2023 · Updated 2 years ago
- [WACV 2023] Temporal Feature Enhancement Dilated Convolution Network for Weakly-supervised Temporal Action Localization ☆13 · Mar 9, 2024 · Updated last year
- Change Detection towards Bitemporal Quality Difference via Hierarchical Correlation Distillation ☆10 · Apr 30, 2024 · Updated last year
- Delving into the Continuous Domain Adaptation (ACM MM22) ☆12 · Jul 10, 2022 · Updated 3 years ago
- Graph Convolutional Module for Temporal Action Localization in Videos ☆10 · Jul 4, 2020 · Updated 5 years ago
- These are various scripts to manipulate and/or measure the acoustic properties of speech sounds ☆16 · Oct 18, 2024 · Updated last year
- ☆10 · Jan 26, 2021 · Updated 5 years ago
- ☆12 · Nov 22, 2022 · Updated 3 years ago
- Official implementation of "Diffusion models meet image counter-forensics" ☆11 · Jan 22, 2024 · Updated 2 years ago
- Calculation of the entropy of the batch of images (whole image or patches) ☆10 · Oct 15, 2021 · Updated 4 years ago
- Dynamic Neural Representational Decoders for High-Resolution Semantic Segmentation ☆19 · Nov 28, 2022 · Updated 3 years ago
- Simple DNN based Voice Activity Detection (VAD) using Pytorch ☆42 · Feb 8, 2020 · Updated 6 years ago
- Computer programming - ShanghaiTech ☆12 · Jan 10, 2020 · Updated 6 years ago
- An imbalanced dataset sampler for PyTorch. ☆11 · Jan 20, 2022 · Updated 4 years ago
- Source Code for Foreground-Background Separation through Concept Distillation from Generative Image Foundation Models ☆12 · Jun 18, 2025 · Updated 7 months ago
- Code for calculate DNS_MOS. ☆43 · Dec 18, 2022 · Updated 3 years ago
- Official Implementation of "Semantics-Consistent Feature Search for Self-Supervised Visual Representation Learning" in AAAI2024. ☆13 · Feb 28, 2024 · Updated last year
- Accepted at IJCAI-2022 ☆11 · Sep 3, 2022 · Updated 3 years ago
- QDL-CMFD: a Quality-independent and Deep Learning-based Copy-Move image Forgery Detection method. ☆13 · Nov 19, 2023 · Updated 2 years ago
- The codes of our paper "EasyInv: Toward Fast and Better DDIM Inversion" ☆14 · Jun 1, 2025 · Updated 8 months ago
- 3D Sound Source Localization using Masked Autoencoders ☆19 · Feb 12, 2025 · Updated last year
- ☆12 · Oct 6, 2022 · Updated 3 years ago
- Code for "Speaker Clustering using Dominant Sets", ICPR 2018 ☆11 · Nov 28, 2020 · Updated 5 years ago
- ☆12 · Jul 11, 2025 · Updated 7 months ago
- Repository for the CVPR23 paper Re^2TAL ☆13 · Nov 21, 2025 · Updated 2 months ago
- AAAI-22 paper: Synthetic Disinformation Attacks on Automated Fact Verification Systems ☆12 · Feb 23, 2022 · Updated 3 years ago
- [InterSpeech'2023] "Betray Oneself: A Novel Audio DeepFake Detection Model via Mono-to-Stereo Conversion" ☆13 · Mar 14, 2024 · Updated last year