facebookresearch/GDT



We present a framework for training multi-modal deep learning models on unlabelled video data by forcing the network to learn invariances to transformations applied to both the audio and video streams.

☆47

Alternatives and similar repositories for GDT

Users who are interested in GDT are comparing it to the repositories listed below.


yukimasano / single_img_pretraining
Code for generating a single image pretraining dataset
☆13 · Aug 3, 2021 · Updated 4 years ago
facebookresearch / selavi
This repo covers the implementation for Labelling unlabelled videos from scratch with multi-modal self-supervision, which learns clusters…
☆118 · Apr 26, 2021 · Updated 5 years ago
searobbersduck / MoCo_v3_pytorch
a pytorch implementation for MoCo V3
☆32 · Apr 14, 2021 · Updated 5 years ago
GeWu-Lab / CSOL_TPAMI2021
The repo for "Class-aware Sounding Objects Localization", TPAMI 2021.
☆29 · Mar 4, 2022 · Updated 4 years ago
SMSD75 / MoSiC
This repo contains the official implementation of ICCV 2025 paper "MoSiC: Optimal-Transport Motion Trajectory for Dense Self-Supervised L…
☆22 · Sep 12, 2025 · Updated 10 months ago
facebookresearch / AVID-CMA
Audio Visual Instance Discrimination with Cross-Modal Agreement
☆133 · Aug 13, 2021 · Updated 4 years ago
MIS-DevWorks / FBR
This repository contains the official code for "Flexible Biometrics Recognition: Bridging the Multimodality Gap through Attention, Alignm…
☆11 · Oct 9, 2024 · Updated last year
LHL3341 / MetaLadder
MetaLadder: Ascending Mathematical Solution Quality via Analogical-Problem Reasoning Transfer (EMNLP 2025)
☆12 · Apr 18, 2025 · Updated last year
princetonvisualai / DomainBiasMitigation
☆72 · Apr 19, 2020 · Updated 6 years ago
chingyaoc / estimating-generalization
ICML 2020, Estimating Generalization under Distribution Shifts via Domain-Invariant Representations
☆23 · Jun 30, 2020 · Updated 6 years ago
oxai / intersectional_gpt2
☆11 · Mar 25, 2024 · Updated 2 years ago
jyapayne / UnixWatchCommandOutput
Allows the user to input a program and watch the output. This is pretty much the same
☆18 · Dec 16, 2012 · Updated 13 years ago
fdreyer / LundNet
Jet tagging in the Lund plane with graph networks
☆10 · Dec 16, 2021 · Updated 4 years ago
YunseokJANG / amc-gan
☆13 · Jul 13, 2018 · Updated 8 years ago
snap-research / edit3d
Code for Cross-Modal 3D Shape Generation and Manipulation (ECCV 2022)
☆29 · May 23, 2023 · Updated 3 years ago
yukimasano / rck_abm
Code release to our paper on an agent-based model of the Ramsey-Cass-Koopmans macroeconomic model. In this model, the households imitate …
☆13 · Jun 3, 2021 · Updated 5 years ago
Gorilla-Lab-SCUT / OrthDNNs
Code for OrthDNNs: Orthogonal Deep Neural Networks
☆14 · Jan 9, 2020 · Updated 6 years ago
Jeffkang-94 / Mixmatch-pytorch-SSL
An unofficial PyTorch implementation of MixMatch - A Holistic Approach to Semi-Supervised Learning
☆14 · Aug 10, 2021 · Updated 4 years ago
hche11 / Localizing-Visual-Sounds-the-Hard-Way
Localizing Visual Sounds the Hard Way
☆84 · Jul 6, 2022 · Updated 4 years ago
twuilliam / nodes-info
Resources on how to use the GPU clusters
☆33 · Sep 30, 2020 · Updated 5 years ago
facebookresearch / MT-EQA
Multi-Target Embodied Question Answering
☆26 · Jul 17, 2020 · Updated 6 years ago
matteomancini / neurosnippets
Repository for NeuroSnippets Code.
☆12 · Apr 19, 2022 · Updated 4 years ago
zhihengli-UR / discover_unknown_biases
Official code of "Discover the Unknown Biased Attribute of an Image Classifier" (ICCV 2021)
☆21 · Oct 11, 2021 · Updated 4 years ago
illidanlab / incomplete-multimodal-kd
☆11 · May 16, 2022 · Updated 4 years ago
TeachingKidsProgramming / TeachingKidsProgramming.Python
Codenvy Workspace for TKP Python
☆11 · Feb 21, 2026 · Updated 5 months ago
TengdaHan / slurm_web
Website-based resource monitor for Slurm system
☆39 · Apr 6, 2023 · Updated 3 years ago
google-research / trecs_image_generation
☆24 · Feb 24, 2021 · Updated 5 years ago
3cology / dinov2_with_attention_extraction
PyTorch code and models for the DINOv2 self-supervised learning method.
☆12 · Nov 12, 2023 · Updated 2 years ago
dkopi / Bitune
Implementation of Bitune: Bidirectional Instruction-Tuning
☆27 · Jun 19, 2025 · Updated last year
pietroastolfi / suave-daino
Code release for "Semi-supervised learning made simple with self-supervised clustering"
☆63 · Jun 13, 2023 · Updated 3 years ago
alvinliu0 / Visual-Sound-Localization-in-the-Wild
Code for Visual Sound Localization in the Wild by Cross-Modal Interference Erasing (AAAI 2022).
☆29 · Feb 15, 2022 · Updated 4 years ago
jiequancui / ResLT
ResLT: Residual Learning for Long-tailed Recognition (TPAMI 2022)
☆62 · Nov 7, 2023 · Updated 2 years ago
e-bug / cross-modal-ablation
[EMNLP 2021] Code and data for our paper "Vision-and-Language or Vision-for-Language? On Cross-Modal Influence in Multimodal Transformers…
☆20 · Jan 17, 2022 · Updated 4 years ago
MILVLG / rosita
ROSITA: Enhancing Vision-and-Language Semantic Alignments via Cross- and Intra-modal Knowledge Integration
☆57 · Jun 13, 2023 · Updated 3 years ago
vpariza / NeCo
"Near, far: Patch-ordering enhances vision foundation models' scene understanding": A New SSL Post-Training Approach for Improving DINOv2…
☆33 · Apr 20, 2025 · Updated last year
enoche / tGraphAD
Research sources on graph-based anomaly detection
☆13 · Nov 29, 2022 · Updated 3 years ago
JianhongBai / BaCon
Official implementation of "Towards Distribution-Agnostic Generalized Category Discovery" (NIPS 2023)
☆29 · Oct 21, 2023 · Updated 2 years ago
naamahadad / A-Two-Step-Disentanglement-Method
Model implementation and trained network for "A Two-Step Disentanglement Method" by Naama Hadad, Lior Wolf and Moni Shahar
☆21 · Mar 21, 2018 · Updated 8 years ago
kyuyeonpooh / objects-that-sound
The unofficial implementation of paper, "Objects that Sound", from ECCV 2018.
☆31 · Jan 29, 2024 · Updated 2 years ago