gaasher / I-JEPA
Implementation of I-JEPA from "Self-Supervised Learning from Images with a Joint-Embedding Predictive Architecture"
☆276 Updated 10 months ago
Alternatives and similar repositories for I-JEPA
Users that are interested in I-JEPA are comparing it to the libraries listed below
- Fine-tuning "ImageBind One Embedding Space to Bind Them All" with LoRA ☆192 Updated last year
- Implementation of Block Recurrent Transformer - Pytorch ☆221 Updated last year
- Official code for "TOAST: Transfer Learning via Attention Steering" ☆186 Updated 2 years ago
- This is the repository for the Photorealistic Unreal Graphics (PUG) datasets for representation learning. ☆238 Updated last year
- A framework for merging models solving different tasks with different initializations into one multi-task model without any additional tr… ☆308 Updated last year
- Implementation of Soft MoE, proposed by Brain's Vision team, in Pytorch ☆336 Updated 7 months ago
- Implementation of Recurrent Memory Transformer, NeurIPS 2022 paper, in Pytorch ☆417 Updated 10 months ago
- ☆209 Updated 2 years ago
- Code release for "Dropout Reduces Underfitting" ☆315 Updated 2 years ago
- ☆189 Updated 2 years ago
- Experiments in Joint Embedding Predictive Architectures (JEPAs). ☆43 Updated last year
- Internet Explorer explores the web in a self-supervised manner to progressively find relevant examples that improve performance on a desi… ☆163 Updated 2 years ago
- Implementation of 🌻 Mirasol, SOTA Multimodal Autoregressive model out of Google DeepMind, in Pytorch ☆90 Updated last year
- Implementation of MaMMUT, a simple vision-encoder text-decoder architecture for multimodal tasks from Google, in Pytorch ☆103 Updated 2 years ago
- Learning from synthetic data - code and models ☆325 Updated last year
- [TMLR23] Official implementation of UnIVAL: Unified Model for Image, Video, Audio and Language Tasks. ☆232 Updated last year
- Code release for "Learning Video Representations from Large Language Models" ☆538 Updated 2 years ago
- [CVPR 2024] VCoder: Versatile Vision Encoders for Multimodal Large Language Models ☆279 Updated last year
- Official Open Source code for "Masked Autoencoders As Spatiotemporal Learners" ☆354 Updated 11 months ago
- Hiera: A fast, powerful, and simple hierarchical vision transformer. ☆1,040 Updated last year
- Official JAX implementation of Learning to (Learn at Test Time): RNNs with Expressive Hidden States ☆426 Updated 3 weeks ago
- Experiments around a simple idea for inducing multiple hierarchical predictive models within a GPT ☆224 Updated last year
- This repo contains the code for the paper "Intuitive physics understanding emerges from self-supervised pretraining on natural videos" ☆196 Updated 9 months ago
- This is the official repository for the LENS (Large Language Models Enhanced to See) system. ☆355 Updated 4 months ago
- Visualizing representations with diffusion based conditional generative model. ☆102 Updated 2 years ago
- PyTorch Implementation of Jamba: "Jamba: A Hybrid Transformer-Mamba Language Model" ☆195 Updated 3 weeks ago
- Official Implementation of the CrossMAE paper: Rethinking Patch Dependence for Masked Autoencoders ☆124 Updated 7 months ago
- [ICCV25] Official Implementation of LeGrad ☆82 Updated last year
- Holds code for our CVPR'23 tutorial: All Things ViTs: Understanding and Interpreting Attention in Vision. ☆196 Updated 2 years ago
- Official code for VisProg (CVPR 2023 Best Paper!) ☆751 Updated last year