Yuanbo2020/Audio-Visual-VAD

☆13

Alternatives and similar repositories for Audio-Visual-VAD

Users who are interested in Audio-Visual-VAD are comparing it to the libraries listed below.

haoheliu / ontology-aware-audio-tagging
☆14 · Nov 22, 2022 · Updated 3 years ago
zfang399 / AlignNet
AlignNet: A Unifying Approach to Audio-Visual Alignment (WACV 2020)
☆34 · Jan 10, 2021 · Updated 5 years ago
Riroaki / Chinese-Rhythm-Predictor
A Chinese prosody prediction model based on random forests and conditional random fields
☆28 · Jul 25, 2024 · Updated 2 years ago
xieyuankun / FSD-Dataset
This repository presents the FSD dataset for song deepfake detection.
☆24 · Aug 18, 2025 · Updated 11 months ago
mgjinnn / wm_baseline
baseline
☆15 · May 16, 2023 · Updated 3 years ago
JuanFMontesinos / Solos
Solos: A Dataset for Audio-Visual Music Analysis
☆24 · Feb 17, 2023 · Updated 3 years ago
zcxu-eric / Ego4d_TalkNet_ASD
☆21 · Feb 15, 2022 · Updated 4 years ago
SSYSteve / GRATIS
☆16 · Sep 7, 2024 · Updated last year
raymondxyy / strfnet-IS2020
Official repo for the STRFNet system that appeared at INTERSPEECH 2020
☆12 · Mar 6, 2021 · Updated 5 years ago
peck94 / pytorch_shearlets
A PyTorch implementation of the shearlet transform.
☆19 · Oct 9, 2025 · Updated 9 months ago
Zeqiang-Lai / Prosody_Prediction
Predict prosody labels for Chinese sentences.
☆42 · Jul 7, 2022 · Updated 4 years ago
jagger2048 / WebRtc_AGC1
This repository is a demo of the WebRTC AGC module.
☆12 · Jan 23, 2019 · Updated 7 years ago
m-kazuki / AuxIVA
☆12 · May 30, 2019 · Updated 7 years ago
Okrio / deepvqe
☆14 · Oct 12, 2023 · Updated 2 years ago
l3das / L3DAS21
☆37 · Jun 22, 2022 · Updated 4 years ago
sdas-ghub98 / ML_Watermarking
A robust video watermarking technique using SVD and DWT in OpenCV Python
☆18 · May 5, 2020 · Updated 6 years ago
powerycy / BossHunter
Smart job-hunting agent: AI-powered automation from scraping to delivery
☆25 · Jun 29, 2026 · Updated 3 weeks ago
EGO4D / audio-visual
☆69 · Sep 13, 2022 · Updated 3 years ago
afourast / avobjects
Implementation for ECCV20 paper "Self-Supervised Learning of audio-visual objects from video"
☆114 · Nov 16, 2020 · Updated 5 years ago
ScarletMercy / chcode
Terminal-based AI coding agent: 7000+ lines, 14 tools, session persistence, git-aware workflow. Built with LangChain + Typer + Rich.
☆18 · Jul 16, 2026 · Updated last week
adam2go / mfcc
Calculate MFCC/Fbank features for wav files
☆15 · Nov 21, 2017 · Updated 8 years ago
czy1999 / ChronoQA
ChronoQA - A Question Answering Dataset for Temporal-Sensitive Retrieval-Augmented Generation
☆20 · Dec 26, 2025 · Updated 7 months ago
abdfahim / audioprocessing
Standard libraries for audio processing, especially STFT and Spherical Harmonics decomposition of a soundfield.
☆10 · Nov 29, 2021 · Updated 4 years ago
Livefull / SphereDiar
☆11 · May 4, 2020 · Updated 6 years ago
MrCrims / LoT-Pass-Long-term-robust-Image-Watermarking-for-Image-to-Video-Generation
Official implementation of "I2VWM: Robust Watermarking for Image to Video Generation"
☆15 · Jun 23, 2026 · Updated last month
JusperLee / TFACM
☆23 · Jul 16, 2025 · Updated last year
noiseux1523 / NIST-SRE-2019
Score Normalization for NIST 2019 Speaker Recognition Evaluation
☆10 · Nov 8, 2019 · Updated 6 years ago
cprakashagr / hog-svm-tf
Implementing -- Histogram of oriented gradients / Support Vector Machine / TensorFlow
☆11 · Mar 15, 2017 · Updated 9 years ago
Liu-Feng-deeplearning / TTS-frontend
TTS frontend with BERT and CRF/LSTM (for Tacotron)
☆53 · Jun 2, 2020 · Updated 6 years ago
a-nagrani / VoxSRC2020
Development Toolkit for the VoxCeleb Speaker Recognition Challenge 2020
☆43 · Jul 17, 2020 · Updated 6 years ago
Sara-mibo / LRP_EncoderDecoder_GRU
Implementing LRP (Layer-wise Relevance Propagation) for a sequence-to-sequence model with GRU layers.
☆12 · Sep 8, 2023 · Updated 2 years ago
wolfparticle / lee-nlp_asr2020
Compiled mainly from the materials of Prof. Hung-yi Lee's 2020 Human Language Processing course, including code and slides (PPT)
☆34 · May 25, 2021 · Updated 5 years ago
shincling / discreteSeparation
The demo for "Discretization and Re-synthesis: an alternative method to solve the Cocktail Party Problem".
☆12 · Oct 25, 2021 · Updated 4 years ago
dharwath / DAVEnet-pytorch
Deep Audio-Visual Embedding network (DAVEnet) implementation in PyTorch
☆66 · Aug 31, 2018 · Updated 7 years ago
hcy71o / SNAC
Unofficial PyTorch implementation of SNAC: Speaker-normalized affine coupling layer in flow-based architecture for zero-shot multi-speake…
☆57 · Aug 7, 2023 · Updated 2 years ago
Kahsolt / TransTacoS-RetuneGAN
A toy-like text-to-speech system for Chinese/Mandarin synthesis, inspired by Tacotron, FastSpeech2, and RefineGAN.
☆15 · May 25, 2022 · Updated 4 years ago
yulinsysu / REVMark
A Novel Deep Video Watermarking Framework with Enhanced Robustness to H.264/AVC Compression
☆25 · Jun 29, 2024 · Updated 2 years ago
TanUkkii007 / deepvoice3-tensorflow
A TensorFlow-based implementation of DeepVoice3 https://arxiv.org/abs/1710.07654
☆13 · Jun 5, 2018 · Updated 8 years ago
zhaoyi2 / CVTE_chain_model_finetune
Fine-tune the chain model based on the CVTE open-source model without training any GMM for frame alignment
☆12 · Aug 6, 2020 · Updated 5 years ago