EPFL-VILAB/MultiMAE

MultiMAE: Multi-modal Multi-task Masked Autoencoders, ECCV 2022

☆632

Alternatives and similar repositories for MultiMAE

Users interested in MultiMAE are comparing it to the repositories listed below.

facebookresearch / omnivore
Omnivore: A Single Model for Many Visual Modalities
☆573 · Nov 12, 2022 · Updated 3 years ago
facebookresearch / mae
PyTorch implementation of MAE (https://arxiv.org/abs/2111.06377)
☆8,366 · Jul 23, 2024 · Updated last year
Alpha-VL / ConvMAE
ConvMAE: Masked Convolution Meets Masked Autoencoders
☆530 · Mar 14, 2023 · Updated 3 years ago
hustvl / MIMDet
[ICCV 2023] You Only Look at One Partial Sequence
☆343 · Oct 21, 2023 · Updated 2 years ago
facebookresearch / msn
Masked Siamese Networks for Label-Efficient Learning (https://arxiv.org/abs/2204.07141)
☆463 · May 9, 2022 · Updated 4 years ago
Sense-GVT / DeCLIP
Supervision Exists Everywhere: A Data Efficient Contrastive Language-Image Pre-training Paradigm
☆678 · Sep 19, 2022 · Updated 3 years ago
microsoft / SimMIM
This is an official implementation for "SimMIM: A Simple Framework for Masked Image Modeling".
☆1,047 · Sep 29, 2022 · Updated 3 years ago
facebookresearch / long_seq_mae
Code release for the research paper "Exploring Long-Sequence Masked Autoencoders"
☆100 · Oct 14, 2022 · Updated 3 years ago
facebookresearch / SLIP
Code release for SLIP: Self-supervision meets Language-Image Pre-training
☆792 · Feb 9, 2023 · Updated 3 years ago
ucasligang / awesome-MIM
Reading list for research topics in Masked Image Modeling
☆333 · Dec 3, 2024 · Updated last year
microsoft / X-Decoder
[CVPR 2023] Official Implementation of X-Decoder for generalized decoding for pixel, image and language
☆1,346 · Oct 5, 2023 · Updated 2 years ago
microsoft / GLIP
Grounded Language-Image Pre-training
☆2,605 · Jan 24, 2024 · Updated 2 years ago
bytedance / ibot
iBOT: Image BERT Pre-Training with Online Tokenizer (ICLR 2022)
☆776 · Apr 14, 2022 · Updated 4 years ago
young-geng / m3ae_public
Multimodal Masked Autoencoders (M3AE): A JAX/Flax Implementation
☆110 · Feb 26, 2025 · Updated last year
NVlabs / GroupViT
Official PyTorch implementation of GroupViT: Semantic Segmentation Emerges from Text Supervision, CVPR 2022.
☆788 · May 10, 2022 · Updated 4 years ago
facebookresearch / Detic
Code release for "Detecting Twenty-thousand Classes using Image-level Supervision".
☆2,007 · Mar 21, 2024 · Updated 2 years ago
MCG-NJU / VideoMAE
[NeurIPS 2022 Spotlight] VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training
☆1,775 · Dec 8, 2023 · Updated 2 years ago
LAION-AI / laion50BU
Un-*** 50 billions multimodality dataset
☆24 · Sep 14, 2022 · Updated 3 years ago
lxtGH / Video-K-Net
[CVPR 2022 (oral)] Video K-Net: A Simple, Strong, and Unified Baseline for Video Segmentation
☆157 · Aug 19, 2023 · Updated 2 years ago
amazon-science / bigdetection
BigDetection: A Large-scale Benchmark for Improved Object Detector Pre-training
☆399 · Oct 23, 2024 · Updated last year
naver-ai / vidt
☆319 · Oct 26, 2022 · Updated 3 years ago
lxtGH / CAE
This is a PyTorch implementation of "Context AutoEncoder for Self-Supervised Representation Learning"
☆199 · Jan 11, 2023 · Updated 3 years ago
ShoufaChen / AdaptFormer
[NeurIPS 2022] Implementation of "AdaptFormer: Adapting Vision Transformers for Scalable Visual Recognition"
☆388 · Sep 16, 2022 · Updated 3 years ago
czczup / ViT-Adapter
[ICLR 2023 Spotlight] Vision Transformer Adapter for Dense Predictions
☆1,503 · Jun 3, 2025 · Updated last year
google-research / pix2seq
Pix2Seq codebase: multi-tasks with generative modeling (autoregressive and diffusion)
☆945 · Nov 7, 2023 · Updated 2 years ago
CASIA-LMC-Lab / Obj2Seq
Obj2Seq: Formatting Objects as Sequences with Class Prompt for Visual Tasks (NeurIPS 2022)
☆85 · Nov 2, 2022 · Updated 3 years ago
TencentARC / ConMIM
Official codes for ConMIM (ICLR 2023)
☆58 · Feb 8, 2023 · Updated 3 years ago
prismformore / Multi-Task-Transformer
Code of ICLR 2023 paper "TaskPrompter: Spatial-Channel Multi-Task Prompting for Dense Scene Understanding" and ECCV 2022 paper "Inverted Py…
☆326 · Apr 24, 2024 · Updated 2 years ago
amirbar / visual_prompting
Official implementation and data release of the paper "Visual Prompting via Image Inpainting".
☆319 · Aug 7, 2023 · Updated 2 years ago
ashkamath / mdetr
☆1,050 · Oct 3, 2022 · Updated 3 years ago
facebookresearch / ConvNeXt
Code release for ConvNeXt model
☆6,414 · Jan 8, 2023 · Updated 3 years ago
UCSC-VLAA / CLIPA
[NeurIPS 2023] This repository includes the official implementation of our paper "An Inverse Scaling Law for CLIP Training"
☆321 · Jun 3, 2024 · Updated 2 years ago
yuhangzang / OV-DETR
[Under preparation] Code repo for "Open-Vocabulary DETR with Conditional Matching" (ECCV 2022)
☆240 · Aug 3, 2022 · Updated 3 years ago
LightDXY / BootMAE
ECCV 2022, Bootstrapped Masked Autoencoders for Vision BERT Pretraining
☆97 · Nov 2, 2022 · Updated 3 years ago
liuzhuang13 / anytime
Anytime Dense Prediction with Confidence Adaptivity (ICLR 2022)
☆51 · Aug 23, 2024 · Updated last year
zihangJiang / TokenLabeling
PyTorch implementation of "All Tokens Matter: Token Labeling for Training Better Vision Transformers"
☆436 · Sep 5, 2023 · Updated 2 years ago
facebookresearch / mae_st
Official Open Source code for "Masked Autoencoders As Spatiotemporal Learners"
☆371 · Jan 12, 2026 · Updated 6 months ago
raoyongming / DenseCLIP
[CVPR 2022] DenseCLIP: Language-Guided Dense Prediction with Context-Aware Prompting
☆550 · Sep 15, 2023 · Updated 2 years ago
baaivision / EVA
EVA Series: Visual Representation Fantasies from BAAI
☆2,684 · Aug 1, 2024 · Updated last year