facebookresearch / GDT

We present a framework for training multi-modal deep learning models on unlabelled video data by forcing the network to learn invariances to transformations applied to both the audio and video streams.
45Updated 3 years ago

Related projects

Alternatives and complementary repositories for GDT