rhgao/Deep-MIML-Network

Learning to Separate Object Sounds by Watching Unlabeled Video (ECCV 2018)

☆50

Alternatives and similar repositories for Deep-MIML-Network

Users who are interested in Deep-MIML-Network are comparing it to the repositories listed below.

hangzhaomit / Sound-of-Pixels
Codebase for ECCV18 "The Sound of Pixels"
☆393 · Apr 25, 2022 · Updated 4 years ago
facebookresearch / VisualEchoes
VisualEchoes Dataset (ECCV 2020)
☆37 · Aug 31, 2021 · Updated 4 years ago
rhgao / Im2Flow
Im2Flow: Motion Hallucination from Static Images for Action Recognition (CVPR 2018)
☆56 · Sep 4, 2018 · Updated 7 years ago
facebookresearch / 2.5D-Visual-Sound
2.5D visual sound
☆121 · Jul 25, 2023 · Updated 2 years ago
SheldonTsui / SepStereo_ECCV2020
Codebase for the paper "Sep-Stereo: Visually Guided Stereophonic Audio Generation by Associating Source Separation" (ECCV 2020)
☆72 · Oct 20, 2020 · Updated 5 years ago
afperezm / acoustic-images-distillation
Code for the paper: Audio-Visual Model Distillation Using Acoustic Images
☆21 · Mar 24, 2023 · Updated 3 years ago
Yu-Wu / Modaily-Aware-Audio-Visual-Video-Parsing
Code for CVPR 2021 paper Exploring Heterogeneous Clues for Weakly-Supervised Audio-Visual Video Parsing
☆24 · Dec 29, 2021 · Updated 4 years ago
roudimit / MUSIC_dataset
MUSIC Dataset from The Sound of Pixels (ECCV '18)
☆137 · Aug 12, 2022 · Updated 3 years ago
andrewowens / multisensory
Code for the paper: Audio-Visual Scene Analysis with Self-Supervised Multisensory Features
☆225 · Jul 17, 2019 · Updated 7 years ago
facebookresearch / FAIR-Play
2.5D visual sound dataset
☆108 · Sep 21, 2021 · Updated 4 years ago
ardasnck / learning_to_localize_sound_source
Codebase and Dataset for the paper: Learning to Localize Sound Source in Visual Scenes
☆102 · Dec 4, 2024 · Updated last year
GenjiB / ECLIPSE
☆33 · Mar 10, 2023 · Updated 3 years ago
stoneMo / MGN
Official implementation for MGN
☆20 · Dec 22, 2022 · Updated 3 years ago
rhgao / on-demand-learning
On-Demand Learning for Deep Image Restoration (ICCV 2017)
☆82 · Aug 5, 2017 · Updated 8 years ago
YapengTian / AVVP-ECCV20
Unified Multisensory Perception: Weakly-Supervised Audio-Visual Video Parsing, ECCV, 2020. (Spotlight)
☆90 · Jul 25, 2024 · Updated last year
krantiparida / awesome-audio-visual
A curated list of different papers and datasets in various areas of audio-visual processing
☆775 · Jan 30, 2024 · Updated 2 years ago
eborboihuc / SoundNet-tensorflow
TensorFlow implementation of "SoundNet".
☆144 · Mar 26, 2018 · Updated 8 years ago
V-Sense / 360AudioVisual
This repository contains materials for the paper: Towards generating ambisonics using audio-visual cues for virtual reality
☆13 · Jul 2, 2019 · Updated 7 years ago
pedro-morgado / spatialaudiogen
Spatial Audio Generation
☆117 · Mar 24, 2023 · Updated 3 years ago
visipedia / ssw60
Sapsucker Woods 60 Audiovisual Dataset
☆19 · Oct 7, 2022 · Updated 3 years ago
GeWu-Lab / MUSIC-AVQA
MUSIC-AVQA, CVPR 2022 (Oral)
☆100 · Dec 30, 2022 · Updated 3 years ago
StanfordVL / STGraph
Codebase for CVPR 2020 paper "Spatio-Temporal Graph for Video Captioning with Knowledge Distillation"
☆23 · Mar 4, 2020 · Updated 6 years ago
channelCS / Audio-Vision
Implementations and reviews of audio and computer vision papers in Python using Keras and TensorFlow.
☆40 · Nov 1, 2018 · Updated 7 years ago
AmeenAli / VideoMatch
☆14 · Jan 5, 2022 · Updated 4 years ago
ConferencingSpeech / ConferencingSpeech2022
Non-intrusive Objective Speech Quality Assessment (NISQA) Challenge in Online Conferencing Applications
☆45 · Apr 11, 2022 · Updated 4 years ago
PotassiumWings / BUAA-CO-2019
☆11 · Jan 15, 2020 · Updated 6 years ago
DragonLiu1995 / xRIR_code
[CVPR 2025] PyTorch implementation of the paper "Hearing Anywhere in Any Environment"
☆33 · Sep 18, 2025 · Updated 10 months ago
DTaoo / Discriminative-Sounding-Objects-Localization
Code for Discriminative Sounding Objects Localization (NeurIPS 2020)
☆61 · Jan 19, 2022 · Updated 4 years ago
luomingshuang / k2-speechbrain
In this repository, I try to combine k2 with speechbrain to decode well and quickly.
☆16 · Jun 17, 2022 · Updated 4 years ago
danathughes / DeepEmbeddedClustering
TensorFlow implementation of Deep Embedded Clustering (DEC) for unsupervised learning
☆20 · Jul 27, 2017 · Updated 8 years ago
marl / l3embedding
Learn an L3 embedding from audio/video pairs
☆89 · Apr 24, 2022 · Updated 4 years ago
LuoweiZhou / ProcNets-YouCook2
Source code for paper "Towards Automatic Learning of Procedures from Web Instructional Videos"
☆34 · Jan 6, 2019 · Updated 7 years ago
gchrupala / visually-grounded-speech
Representations of language in a model of visually grounded speech signal.
☆23 · Apr 19, 2018 · Updated 8 years ago
dialogtekgeek / AudioVisualSceneAwareDialog
☆27 · May 4, 2020 · Updated 6 years ago
Chiaraplizz / ARGO1M-What-can-a-cook
☆11 · Jul 14, 2023 · Updated 3 years ago
bill9800 / speech_separation
Includes some core functions and a model to handle speech separation
☆156 · Jun 24, 2021 · Updated 5 years ago
brian7685 / Multimodal-Clustering-Network
ICCV 2021
☆34 · May 11, 2022 · Updated 4 years ago
StevenWangNPU / Robust_L21_LDA_TPAMI2019
Towards Robust Discriminative Projections Learning via Non-greedy $\ell_{2,1}$-Norm MinMax
☆10 · Jul 2, 2020 · Updated 6 years ago
awentzonline / keras-visual-semantic-embedding
Mostly for using the trained weights from https://github.com/ryankiros/visual-semantic-embedding in Keras
☆20 · Apr 23, 2016 · Updated 10 years ago