lucidrains / MaMMUT-pytorch

Implementation of MaMMUT, a simple vision-encoder text-decoder architecture for multimodal tasks from Google, in Pytorch

☆103

Alternatives and similar repositories for MaMMUT-pytorch

Users who are interested in MaMMUT-pytorch are comparing it to the repositories listed below.

LAION-AI / General-GPT
☆65 Updated last year
TomerRonen34 / mixed-resolution-vit
☆51 Updated last year
facebookresearch / maws
Code and models for the paper "The effectiveness of MAE pre-pretraining for billion-scale pretraining" https://arxiv.org/abs/2303.13496
☆91 Updated 3 months ago
lucidrains / zorro-pytorch
Implementation of Zorro, Masked Multimodal Transformer, in Pytorch
☆97 Updated last year
patil-suraj / vit-vqgan
JAX implementation of ViT-VQGAN
☆83 Updated 2 years ago
kirill-vish / Beyond-INet
Code for experiments for "ConvNet vs Transformer, Supervised vs CLIP: Beyond ImageNet Accuracy"
☆101 Updated 10 months ago
LAION-AI / laion50BU
Un-*** 50 billions multimodality dataset
☆23 Updated 2 years ago
ml-jku / semantic-image-text-alignment
☆24 Updated 2 years ago
lucidrains / mirasol-pytorch
Implementation of 🌻 Mirasol, SOTA Multimodal Autoregressive model out of Google Deepmind, in Pytorch
☆89 Updated last year
rwightman / imagenet-12k
ImageNet-12k subset of ImageNet-21k (fall11)
☆21 Updated 2 years ago
facebookresearch / HierVL
[CVPR 2023] HierVL: Learning Hierarchical Video-Language Embeddings
☆46 Updated last year
lucidrains / discrete-key-value-bottleneck-pytorch
Implementation of Discrete Key / Value Bottleneck, in Pytorch
☆88 Updated 2 years ago
mlfoundations / patching
Patching open-vocabulary models by interpolating weights
☆91 Updated last year
bfshi / TOAST
Official code for "TOAST: Transfer Learning via Attention Steering"
☆189 Updated last year
facebookresearch / meru
Code for the paper "Hyperbolic Image-Text Representations", Desai et al, ICML 2023
☆174 Updated last year
enyac-group / supmae
This is an official PyTorch/GPU implementation of SupMAE.
☆78 Updated 2 years ago
allenai / grit_official
Official repository for the General Robust Image Task (GRIT) Benchmark
☆54 Updated 2 years ago
LAION-AI / scaling-laws-openclip
Reproducible scaling laws for contrastive language-image learning (https://arxiv.org/abs/2212.07143)
☆171 Updated last month
lucidrains / hourglass-transformer-pytorch
Implementation of Hourglass Transformer, in Pytorch, from Google and OpenAI
☆91 Updated 3 years ago
karttikeya / minREV
A simple minimal implementation of Reversible Vision Transformers
☆125 Updated last year
rom1504 / CLIP
Contrastive Language-Image Pretraining
☆38 Updated last year
OliverRensu / D-iGPT
[ICML 2024] This repository includes the official implementation of our paper "Rejuvenating image-GPT as Strong Visual Representation Lea…
☆98 Updated last year
facebookresearch / VLaMP
Code for “Pretrained Language Models as Visual Planners for Human Assistance”
☆61 Updated 2 years ago
OscarXZQ / weight-selection
☆182 Updated 10 months ago
mshukor / ViCHA
[BMVC22] Official Implementation of ViCHA: "Efficient Vision-Language Pretraining with Visual Concepts and Hierarchical Alignment"
☆55 Updated 2 years ago
jmerullo / limber
https://arxiv.org/abs/2209.15162
☆50 Updated 2 years ago
philippe-eecs / small-vision
A big_vision-inspired repo that implements a generic Auto-Encoder class capable of representation learning and generative modeling.
☆34 Updated last year
jeykigung / HiCLIP
☆29 Updated 2 years ago
lucidrains / LVMAE-pytorch
Implementation of the proposed LVMAE, from the paper, Extending Video Masked Autoencoders to 128 frames, in Pytorch
☆52 Updated 8 months ago
noelshin / namedmask
[CVPRW'23] The official PyTorch implementation of NamedMask
☆23 Updated 2 years ago