☆13May 9, 2022Updated 3 years ago
Alternatives and similar repositories for Audio-Visual-VAD
Users that are interested in Audio-Visual-VAD are comparing it to the libraries listed below.
- ☆14Nov 22, 2022Updated 3 years ago
- AlignNet: A Unifying Approach to Audio-Visual Alignment (WACV 2020)☆34Jan 10, 2021Updated 5 years ago
- baseline☆14May 16, 2023Updated 2 years ago
- Chinese prosody prediction model based on random forests and conditional random fields☆28Jul 25, 2024Updated last year
- This repository presents the FSD dataset for song deepfake detection.☆25Aug 18, 2025Updated 7 months ago
- Solos: A Dataset for Audio-Visual Music Analysis☆24Feb 17, 2023Updated 3 years ago
- A robust video watermarking technique using SVD and DWT in OpenCV Python☆18May 5, 2020Updated 5 years ago
- ☆21Feb 15, 2022Updated 4 years ago
- ☆16Sep 7, 2024Updated last year
- Official repo for the STRFNet system, which appeared at INTERSPEECH 2020☆12Mar 6, 2021Updated 5 years ago
- Predict prosody labels for Chinese sentences.☆42Jul 7, 2022Updated 3 years ago
- This repository is a demo of the WebRTC AGC module.☆12Jan 23, 2019Updated 7 years ago
- ☆12May 30, 2019Updated 6 years ago
- ☆14Oct 12, 2023Updated 2 years ago
- ☆37Jun 22, 2022Updated 3 years ago
- A Novel Deep Video Watermarking Framework with Enhanced Robustness to H.264/AVC Compression☆25Jun 29, 2024Updated last year
- Calculate MFCC/Fbank feature for wav files☆15Nov 21, 2017Updated 8 years ago
- Implementation for ECCV20 paper "Self-Supervised Learning of audio-visual objects from video"☆115Nov 16, 2020Updated 5 years ago
- Grapheme-to-phoneme tool for corpus conversion, where phonemes match Phoible inventories☆20Apr 10, 2025Updated last year
- ☆67Sep 13, 2022Updated 3 years ago
- DiffUNet☆15Dec 6, 2024Updated last year
- Standard libraries for audio processing, especially STFT and Spherical Harmonics decomposition of a soundfield.☆10Nov 29, 2021Updated 4 years ago
- ☆11May 4, 2020Updated 5 years ago
- ☆23Jul 16, 2025Updated 8 months ago
- TTS-frontend with Bert and CRF/lstm (For Tacotron)☆53Jun 2, 2020Updated 5 years ago
- Implementing LRP (Layer-wise Relevance Propagation) for a sequence-to-sequence model with GRU layers.☆12Sep 8, 2023Updated 2 years ago
- Development Toolkit for the VoxCeleb Speaker Recognition Challenge 2020☆43Jul 17, 2020Updated 5 years ago
- Score Normalization for NIST 2019 Speaker Recognition Evaluation☆10Nov 8, 2019Updated 6 years ago
- Implementing -- Histogram of oriented gradients / Support Vector Machine / TensorFlow☆11Mar 15, 2017Updated 9 years ago
- Deep Audio-Visual Embedding network (DAVEnet) implementation in PyTorch☆66Aug 31, 2018Updated 7 years ago
- Notes compiled mainly from Hung-yi Lee's 2020 Human Language Processing course materials, including code and slides☆34May 25, 2021Updated 4 years ago
- The demo for "Discretization and Re-synthesis: an alternative method to solve the Cocktail Party Problem".☆12Oct 25, 2021Updated 4 years ago
- Unofficial Pytorch implementation of SNAC: Speaker-normalized affine coupling layer in flow-based architecture for zero-shot multi-speake…☆57Aug 7, 2023Updated 2 years ago
- A toy-like Text-to-Speech for Chinese/Mandarin synthesis, inspired by Tacotron & FastSpeech2 & RefineGAN.☆15May 25, 2022Updated 3 years ago
- A tensorflow based implementation of DeepVoice3 https://arxiv.org/abs/1710.07654☆13Jun 5, 2018Updated 7 years ago
- For a web video that can be downloaded in segments, it is possible to create an overview of the video by downloading only a small amount …☆39May 24, 2024Updated last year
- SinGlow is a part of my singing voice synthesis system. It can extract features of sound, particularly songs and music. Then we can use …☆11Oct 9, 2021Updated 4 years ago
- Dating guide☆17Aug 3, 2022Updated 3 years ago
- Zero-Mean Convolutions for Level-Invariant Singing Voice Detection☆11Jun 15, 2018Updated 7 years ago