☆13May 9, 2022Updated 3 years ago
Alternatives and similar repositories for Audio-Visual-VAD
Users that are interested in Audio-Visual-VAD are comparing it to the libraries listed below.
- ☆14Nov 22, 2022Updated 3 years ago
- AlignNet: A Unifying Approach to Audio-Visual Alignment (WACV 2020)☆34Jan 10, 2021Updated 5 years ago
- baseline☆14May 16, 2023Updated 2 years ago
- Chinese prosody prediction model based on random forests and conditional random fields☆28Jul 25, 2024Updated last year
- This repository presents FSD dataset for song deepfake detection.☆25Aug 18, 2025Updated 7 months ago
- Solos: A Dataset for Audio-Visual Music Analysis☆24Feb 17, 2023Updated 3 years ago
- A robust video watermarking technique using SVD and DWT in OpenCV Python☆18May 5, 2020Updated 5 years ago
- ☆21Feb 15, 2022Updated 4 years ago
- ☆16Sep 7, 2024Updated last year
- Official repo for the STRFNet system presented at INTERSPEECH 2020☆12Mar 6, 2021Updated 5 years ago
- Predict prosody labels for Chinese sentences.☆41Jul 7, 2022Updated 3 years ago
- A demo of the WebRTC AGC module.☆12Jan 23, 2019Updated 7 years ago
- ☆12May 30, 2019Updated 6 years ago
- ☆14Oct 12, 2023Updated 2 years ago
- ☆37Jun 22, 2022Updated 3 years ago
- A Novel Deep Video Watermarking Framework with Enhanced Robustness to H.264/AVC Compression☆25Jun 29, 2024Updated last year
- Implementation for ECCV20 paper "Self-Supervised Learning of audio-visual objects from video"☆115Nov 16, 2020Updated 5 years ago
- Calculate MFCC/Fbank feature for wav files☆15Nov 21, 2017Updated 8 years ago
- Grapheme-to-phoneme tool for corpus conversion, where phonemes match Phoible inventories☆20Apr 10, 2025Updated 11 months ago
- ☆67Sep 13, 2022Updated 3 years ago
- Standard libraries for audio processing, especially STFT and Spherical Harmonics decomposition of a soundfield.☆10Nov 29, 2021Updated 4 years ago
- DiffUNet☆15Dec 6, 2024Updated last year
- ☆11May 4, 2020Updated 5 years ago
- ☆22Jul 16, 2025Updated 8 months ago
- TTS-frontend with Bert and CRF/lstm (For Tacotron)☆53Jun 2, 2020Updated 5 years ago
- Implementing LRP (Layer-wise Relevance Propagation) for a sequence-to-sequence model with GRU layers.☆12Sep 8, 2023Updated 2 years ago
- Development Toolkit for the VoxCeleb Speaker Recognition Challenge 2020☆43Jul 17, 2020Updated 5 years ago
- Score Normalization for NIST 2019 Speaker Recognition Evaluation☆10Nov 8, 2019Updated 6 years ago
- Implementing -- Histogram of oriented gradients / Support Vector Machine / TensorFlow☆11Mar 15, 2017Updated 9 years ago
- Materials compiled mainly from Hung-yi Lee's 2020 Human Language Processing course, including code and slides☆33May 25, 2021Updated 4 years ago
- Deep Audio-Visual Embedding network (DAVEnet) implementation in PyTorch☆66Aug 31, 2018Updated 7 years ago
- The demo for "Discretization and Re-synthesis: an alternative method to solve the Cocktail Party Problem".☆12Oct 25, 2021Updated 4 years ago
- Unofficial Pytorch implementation of SNAC: Speaker-normalized affine coupling layer in flow-based architecture for zero-shot multi-speake…☆57Aug 7, 2023Updated 2 years ago
- A toy text-to-speech system for Chinese/Mandarin synthesis, inspired by Tacotron, FastSpeech2, and RefineGAN.☆15May 25, 2022Updated 3 years ago
- A tensorflow based implementation of DeepVoice3 https://arxiv.org/abs/1710.07654☆13Jun 5, 2018Updated 7 years ago
- For a web video that can be downloaded in segments, it is possible to create an overview of the video by downloading only a small amount …☆39May 24, 2024Updated last year
- SinGlow is part of my singing voice synthesis system. It can extract features of sound, particularly songs and music. Then we can use …☆11Oct 9, 2021Updated 4 years ago
- Dating guide☆17Aug 3, 2022Updated 3 years ago
- Zero-Mean Convolutions for Level-Invariant Singing Voice Detection☆11Jun 15, 2018Updated 7 years ago