yeruoforever / Awesome-Mamba
Awesome works based on SSM and Mamba
☆17 Updated 10 months ago
Alternatives and similar repositories for Awesome-Mamba:
Users who are interested in Awesome-Mamba are comparing it to the repositories listed below
- Implementation of ViTAR: Vision Transformer with Any Resolution in PyTorch ☆31 Updated 3 months ago
- ☆54 Updated last year
- ☆46 Updated 11 months ago
- [CVPR 2024] The official PyTorch implementation of "A General and Efficient Training for Transformer via Token Expansion". ☆43 Updated 10 months ago
- [CVPR 2025] Assessing and Learning Alignment of Unimodal Vision and Language Models ☆24 Updated this week
- [CVPR2024] ModaVerse: Efficiently Transforming Modalities with LLMs ☆29 Updated 7 months ago
- UnifiedMLLM: Enabling Unified Representation for Multi-modal Multi-tasks With Large Language Model ☆21 Updated 6 months ago
- The official implementation of the paper "MMFuser: Multimodal Multi-Layer Feature Fuser for Fine-Grained Vision-Language Understanding". … ☆49 Updated 3 months ago
- ☆33 Updated 7 months ago
- ☆56 Updated this week
- [NeurIPS2024 Spotlight] The official implementation of GrootVL: Tree Topology is All You Need in State Space Model ☆90 Updated 8 months ago
- [CVPR'24] Multimodal Pathway: Improve Transformers with Irrelevant Data from Other Modalities ☆100 Updated 11 months ago
- PyTorch implementation for CVPR 2024 paper: Learn to Rectify the Bias of CLIP for Unsupervised Semantic Segmentation ☆36 Updated last week
- The official implementation of "Adapter is All You Need for Tuning Visual Tasks". ☆82 Updated 6 months ago
- [NIPS2023] Implementation of Foundation Model is Efficient Multimodal Multitask Model Selector ☆36 Updated 11 months ago
- [ICCV 2023] CLR: Channel-wise Lightweight Reprogramming for Continual Learning ☆29 Updated 8 months ago
- Implementation of "VL-Mamba: Exploring State Space Models for Multimodal Learning" ☆80 Updated 11 months ago
- State Space Models ☆64 Updated 10 months ago
- PiTe: Pixel-Temporal Alignment for Large Video-Language Model ☆16 Updated 2 weeks ago
- This repository is the official implementation of our Autoregressive Pretraining with Mamba in Vision ☆69 Updated 8 months ago
- This repository provides an improved LLamaGen Model, fine-tuned on 500,000 high-quality images, each accompanied by over 300 token prompt… ☆30 Updated 4 months ago
- [NeurIPS 2024] MoME: Mixture of Multimodal Experts for Generalist Multimodal Large Language Models ☆47 Updated 2 months ago
- Official Implementation of Attentive Mask CLIP (ICCV2023, https://arxiv.org/abs/2212.08653) ☆29 Updated 9 months ago
- A repository for DenseSSMs ☆87 Updated 10 months ago
- Implementation of "the first large-scale multimodal mixture of experts models." from the paper: "Multimodal Contrastive Learning with… ☆26 Updated last month
- CLIP-Mamba: CLIP Pretrained Mamba Models with OOD and Hessian Evaluation ☆68 Updated 6 months ago
- [ECCV 2024] API: Attention Prompting on Image for Large Vision-Language Models ☆69 Updated 4 months ago
- Official PyTorch implementation for "Diffusion Models and Semi-Supervised Learners Benefit Mutually with Few Labels" ☆88 Updated last year
- The official implementation of ADDP (ICLR 2024) ☆12 Updated 11 months ago
- [NeurIPS 2024] TransAgent: Transfer Vision-Language Foundation Models with Heterogeneous Agent Collaboration ☆23 Updated 4 months ago