OliverRensu / ARM
This repository is the official implementation of our paper "Autoregressive Pretraining with Mamba in Vision".
☆74 · Updated 10 months ago
Alternatives and similar repositories for ARM:
Users who are interested in ARM are comparing it to the libraries listed below.
- Implementation of "VL-Mamba: Exploring State Space Models for Multimodal Learning" ☆81 · Updated last year
- [NeurIPS2024 Spotlight] The official implementation of MambaTree: Tree Topology is All You Need in State Space Model ☆92 · Updated 10 months ago
- [BMVC 2024] PlainMamba: Improving Non-hierarchical Mamba in Visual Recognition ☆75 · Updated 2 weeks ago
- ☆65 · Updated last month
- [ICLR 2023] Masked Frequency Modeling for Self-Supervised Visual Pre-Training ☆75 · Updated last year
- Project Page for "Multi-Task Dense Prediction via Mixture of Low-Rank Experts" ☆71 · Updated 3 months ago
- [CVPR'23 & TPAMI'25] Hard Patches Mining for Masked Image Modeling ☆93 · Updated last week
- ☆86 · Updated 2 years ago
- Official Implementation of the CrossMAE paper: Rethinking Patch Dependence for Masked Autoencoders ☆107 · Updated last week
- [ICML 2024] This repository includes the official implementation of our paper "Rejuvenating image-GPT as Strong Visual Representation Lea… ☆98 · Updated 11 months ago
- [ECCV2024] ProxyCLIP: Proxy Attention Improves CLIP for Open-Vocabulary Segmentation ☆85 · Updated 3 weeks ago
- Adapters Strike Back (CVPR 2024) ☆35 · Updated 8 months ago
- Official PyTorch implementation of Which Tokens to Use? Investigating Token Reduction in Vision Transformers presented at ICCV 2023 NIVT … ☆35 · Updated last year
- [NeurIPS'23] DropPos: Pre-Training Vision Transformers by Reconstructing Dropped Positions ☆60 · Updated 11 months ago
- [ECCV2024] ClearCLIP: Decomposing CLIP Representations for Dense Vision-Language Inference ☆80 · Updated 3 weeks ago
- GroupMamba: Parameter-Efficient and Accurate Group Visual State Space Model [CVPR 2025] ☆91 · Updated last month
- Official implementation of SCLIP: Rethinking Self-Attention for Dense Vision-Language Inference ☆154 · Updated 6 months ago
- PyTorch Implementation for CVPR 2024 paper: Learn to Rectify the Bias of CLIP for Unsupervised Semantic Segmentation ☆40 · Updated this week
- [CVPR2025] Code Release of F-LMM: Grounding Frozen Large Multimodal Models ☆84 · Updated 8 months ago
- [CVPR 2023] This repository includes the official implementation of our paper "Masked Autoencoders Enable Efficient Knowledge Distillers" ☆105 · Updated last year
- ☆33 · Updated last year
- ☆57 · Updated 8 months ago
- Code for the paper "Compositional Entailment Learning for Hyperbolic Vision-Language Models". ☆57 · Updated 2 months ago
- [CVPR'24] Multimodal Pathway: Improve Transformers with Irrelevant Data from Other Modalities ☆99 · Updated last year
- [CVPR 2024] The repository contains the official implementation of "Open-Vocabulary Segmentation with Semantic-Assisted Calibration" ☆70 · Updated 7 months ago
- ☆130 · Updated 10 months ago
- [ECCV 2024] Official PyTorch implementation of DreamLIP: Language-Image Pre-training with Long Captions ☆129 · Updated 4 months ago
- Text-Image Alignment for Diffusion-based Perception (TADP) - CVPR 2024 ☆31 · Updated 7 months ago
- [ICCV 2023] Generative Prompt Model for Weakly Supervised Object Localization ☆56 · Updated last year
- Official repository of InLine attention (NeurIPS 2024) ☆45 · Updated 4 months ago