mlvlab / EfficientViM
Official Implementation (Pytorch) of "EfficientViM: Efficient Vision Mamba with Hidden State Mixer-based State Space Duality"
☆19Updated 2 months ago
Alternatives and similar repositories for EfficientViM:
Users that are interested in EfficientViM are comparing it to the libraries listed below
- [NeurIPS 2024] official code release for our paper "Revisiting the Integration of Convolution and Attention for Vision Backbone".☆32Updated 3 weeks ago
- ☆24Updated 8 months ago
- Official implementation of CVPR 2024 paper "Multi-criteria Token Fusion with One-step-ahead Attention for Efficient Vision Transformers".☆31Updated 9 months ago
- [NeurIPS2024 Spotlight] The official implementation of GrootVL: Tree Topology is All You Need in State Space Model☆89Updated 8 months ago
- Official implementation of paper titled "GroupMamba: Parameter-Efficient and Accurate Group Visual State Space Model"☆65Updated this week
- ☆38Updated 4 months ago
- (ECCV 2024) Can OOD Object Detectors Learn from Foundation Models?☆24Updated 2 months ago
- Official Implementation of "Towards Open-Vocabulary Semantic Segmentation without Semantic Labels" (NeurIPS 2024)☆40Updated 4 months ago
- Official implementation of CVPR 2024 paper "Retrieval-Augmented Open-Vocabulary Object Detection".☆33Updated 5 months ago
- [ECCV2024] ProxyCLIP: Proxy Attention Improves CLIP for Open-Vocabulary Segmentation☆74Updated last week
- Official Implementation (Pytorch) of the "VidChain: Chain-of-Tasks with Metric-based Direct Preference Optimization for Dense Video Capti…☆15Updated 3 weeks ago
- [ECCV2024]The official implementation of the DiffPNG paper in PyTorch.☆11Updated 4 months ago
- ☆24Updated 5 months ago
- [ECCV2024] ClearCLIP: Decomposing CLIP Representations for Dense Vision-Language Inference☆74Updated 5 months ago
- ☆16Updated last year
- 🔥 [CVPR 2024] Official implementation of "See, Say, and Segment: Teaching LMMs to Overcome False Premises (SESAME)"☆32Updated 8 months ago
- [NeurIPS'24] Unleashing the Potential of the Diffusion Model in Few-shot Semantic Segmentation (Diffews)☆27Updated this week
- [ECCV'24] Official PyTorch implementation of In Defense of Lazy Visual Grounding for Open-Vocabulary Semantic Segmentation☆38Updated 4 months ago
- The official repo for "Stepping Stones: A Progressive Training Strategy for Audio-Visual Semantic Segmentation", ECCV 2024☆12Updated 4 months ago
- A Large Multimodal Model for Pixel-Level Visual Grounding in Videos☆41Updated 2 months ago
- This repository is the official implementation of our Autoregressive Pretraining with Mamba in Vision☆68Updated 8 months ago
- ☆22Updated last month
- [CVPR 2024] The repository contains the official implementation of "Open-Vocabulary Segmentation with Semantic-Assisted Calibration"☆67Updated 4 months ago
- Official code for CVPR 2024 paper, "Audio-Visual Segmentation via Unlabeled Frame Exploitation""☆11Updated 7 months ago
- Learning 1D Causal Visual Representation with De-focus Attention Networks☆32Updated 8 months ago
- [NeurIPS 2024] Official PyTorch implementation of LoTLIP: Improving Language-Image Pre-training for Long Text Understanding☆40Updated last month
- Pytorch Implementation for CVPR 2024 paper: Learn to Rectify the Bias of CLIP for Unsupervised Semantic Segmentation☆33Updated last month
- Official PyTorch implementation of "Groupwise Query Specialization and Quality-Aware Multi-Assignment for Transformer-based Visual Relati…☆31Updated 10 months ago
- Disentangled Pre-training for Human-Object Interaction Detection☆19Updated 3 months ago
- This repo is the official pytorch implementation of the paper: CLIPer: Hierarchically Improving Spatial Representation of CLIP for Open-V…☆25Updated last month