OliverRensu / ARM
Official implementation of "Autoregressive Pretraining with Mamba in Vision"
☆68 Updated 7 months ago
Alternatives and similar repositories for ARM:
Users that are interested in ARM are comparing it to the libraries listed below
- [NeurIPS2024 Spotlight] The official implementation of GrootVL: Tree Topology is All You Need in State Space Model ☆89 Updated 7 months ago
- [BMVC 2024] PlainMamba: Improving Non-hierarchical Mamba in Visual Recognition ☆72 Updated 5 months ago
- ☆55 Updated 7 months ago
- The official implementation of "Adapter is All You Need for Tuning Visual Tasks". ☆77 Updated 5 months ago
- [ICLR 2023] Masked Frequency Modeling for Self-Supervised Visual Pre-Training ☆68 Updated last year
- Implementation of "VL-Mamba: Exploring State Space Models for Multimodal Learning" ☆80 Updated 10 months ago
- [ECCV2024] ClearCLIP: Decomposing CLIP Representations for Dense Vision-Language Inference ☆74 Updated 5 months ago
- [NeurIPS'23] DropPos: Pre-Training Vision Transformers by Reconstructing Dropped Positions ☆60 Updated 9 months ago
- ☆32 Updated last year
- Project Page for "Multi-Task Dense Prediction via Mixture of Low-Rank Experts" ☆60 Updated last month
- [CVPR 2024] Official implementation of "Adapters Strike Back" ☆34 Updated 6 months ago
- Official implementation of paper titled "GroupMamba: Parameter-Efficient and Accurate Group Visual State Space Model" ☆65 Updated this week
- [CVPR2024] The code of "UniPT: Universal Parallel Tuning for Transfer Learning with Efficient Parameter and Memory" ☆66 Updated 3 months ago
- [CVPR'23] Hard Patches Mining for Masked Image Modeling ☆89 Updated last year
- [ECCV2024] ProxyCLIP: Proxy Attention Improves CLIP for Open-Vocabulary Segmentation ☆72 Updated 4 months ago
- [ICLR2025] Spatial-Mamba: Effective Visual State Space Models via Structure-Aware State Fusion ☆69 Updated 3 months ago
- [CVPR 2024] The official pytorch implementation of "A General and Efficient Training for Transformer via Token Expansion". ☆44 Updated 9 months ago
- ☆45 Updated 9 months ago
- Learning 1D Causal Visual Representation with De-focus Attention Networks ☆32 Updated 7 months ago
- Official PyTorch implementation of Which Tokens to Use? Investigating Token Reduction in Vision Transformers presented at ICCV 2023 NIVT … ☆33 Updated last year
- Pytorch Implementation for CVPR 2024 paper: Learn to Rectify the Bias of CLIP for Unsupervised Semantic Segmentation ☆32 Updated 3 weeks ago
- ☆119 Updated 7 months ago
- ☆21 Updated last year
- ☆56 Updated 5 months ago
- ☆25 Updated 7 months ago
- 【ICCV 2023】Diverse Data Augmentation with Diffusions for Effective Test-time Prompt Tuning & 【IJCV 2025】Diffusion-Enhanced Test-time Adap… ☆60 Updated 2 weeks ago
- Adapter-X: A Novel General Parameter-Efficient Fine-Tuning Framework for Vision ☆10 Updated 6 months ago
- ☆77 Updated last year
- [NeurIPS 2024] official code release for our paper "Revisiting the Integration of Convolution and Attention for Vision Backbone". ☆31 Updated last week
- Official implementation of CVPR 2024 paper "Multi-criteria Token Fusion with One-step-ahead Attention for Efficient Vision Transformers". ☆31 Updated 9 months ago