gaasher / I-JEPA
Implementation of I-JEPA from "Self-Supervised Learning from Images with a Joint-Embedding Predictive Architecture"
☆259 Updated last month
Alternatives and similar repositories for I-JEPA:
Users who are interested in I-JEPA are comparing it to the libraries listed below.
- Fine-tuning "ImageBind One Embedding Space to Bind Them All" with LoRA ☆178 Updated last year
- Hiera: A fast, powerful, and simple hierarchical vision transformer. ☆951 Updated 11 months ago
- Official code for "TOAST: Transfer Learning via Attention Steering" ☆188 Updated last year
- Implementation of CoCa, Contrastive Captioners are Image-Text Foundation Models, in Pytorch ☆1,104 Updated last year
- Experiments in Joint Embedding Predictive Architectures (JEPAs). ☆38 Updated last year
- Internet Explorer explores the web in a self-supervised manner to progressively find relevant examples that improve performance on a desi … ☆163 Updated last year
- LLaVA-Interactive-Demo ☆362 Updated 6 months ago
- Learning from synthetic data - code and models ☆310 Updated last year
- A concise but complete implementation of CLIP with various experimental improvements from recent papers ☆707 Updated last year
- Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time ☆446 Updated 7 months ago
- Implementation of Recurrent Memory Transformer, Neurips 2022 paper, in Pytorch ☆405 Updated last month
- Implementation of 🌻 Mirasol, SOTA Multimodal Autoregressive model out of Google Deepmind, in Pytorch ☆88 Updated last year
- [TMLR23] Official implementation of UnIVAL: Unified Model for Image, Video, Audio and Language Tasks. ☆224 Updated last year
- ☆599 Updated last year
- Implementation of Parti, Google's pure attention-based text-to-image neural network, in Pytorch ☆527 Updated last year
- Implementation of Soft MoE, proposed by Brain's Vision team, in Pytorch ☆260 Updated 9 months ago
- Implementation of 🦩 Flamingo, state-of-the-art few-shot visual question answering attention net out of Deepmind, in Pytorch ☆1,230 Updated 2 years ago
- This is the repository for the Photorealistic Unreal Graphics (PUG) datasets for representation learning. ☆232 Updated 10 months ago
- Official code for VisProg (CVPR 2023 Best Paper!) ☆704 Updated 5 months ago
- A framework for merging models solving different tasks with different initializations into one multi-task model without any additional tr… ☆293 Updated last year
- Robust fine-tuning of zero-shot models ☆669 Updated 2 years ago
- Code release for "Learning Video Representations from Large Language Models" ☆507 Updated last year
- Implementation of Zorro, Masked Multimodal Transformer, in Pytorch ☆96 Updated last year
- [NeurIPS 2023] This repository includes the official implementation of our paper "An Inverse Scaling Law for CLIP Training" ☆308 Updated 8 months ago
- ICLR2024 Spotlight: curation/training code, metadata, distribution and pre-trained models for MetaCLIP; CVPR 2024: MoDE: CLIP Data Expert… ☆1,352 Updated 2 months ago
- ☆182 Updated last year
- Official PyTorch Implementation of "Learning to Learn with Generative Models of Neural Network Checkpoints" ☆337 Updated 2 years ago
- Implementation of ST-Moe, the latest incarnation of MoE after years of research at Brain, in Pytorch ☆306 Updated 8 months ago
- ☆198 Updated last year
- DataComp: In search of the next generation of multimodal datasets ☆679 Updated last year