hotfinda / VideoMambaPro
Improving Mamba performance on video understanding tasks
☆38 Updated 5 months ago
Alternatives and similar repositories for VideoMambaPro:
Users who are interested in VideoMambaPro are comparing it to the repositories listed below.
- The official repository for ICLR2024 paper "FROSTER: Frozen CLIP is a Strong Teacher for Open-Vocabulary Action Recognition" ☆74 Updated 2 months ago
- UniMD: Towards Unifying Moment retrieval and temporal action Detection ☆43 Updated 8 months ago
- ☆112 Updated last year
- 「AAAI 2024」 Referred by Multi-Modality: A Unified Temporal Transformers for Video Object Segmentation ☆77 Updated 9 months ago
- Tracking with Human-Intent Reasoning ☆70 Updated 4 months ago
- ICCV2023: Disentangling Spatial and Temporal Learning for Efficient Image-to-Video Transfer Learning ☆41 Updated last year
- ☆38 Updated 11 months ago
- ☆17 Updated 8 months ago
- [ICCV'2023 Oral] Implicit Temporal Modeling with Learnable Alignment for Video Recognition ☆35 Updated last year
- [CVPR 2024] Context-Guided Spatio-Temporal Video Grounding ☆51 Updated 9 months ago
- ☆67 Updated 4 months ago
- [ECCV 2024] Elysium: Exploring Object-level Perception in Videos via MLLM ☆70 Updated 5 months ago
- ☆50 Updated 9 months ago
- [CVPR2024] The code of "UniPT: Universal Parallel Tuning for Transfer Learning with Efficient Parameter and Memory" ☆67 Updated 5 months ago
- [T-PAMI 2023] Temporal Perceiver: A General Architecture for Arbitrary Boundary Detection ☆35 Updated last year
- [CVPR2024] UFineBench: Towards Text-based Person Retrieval with Ultra-fine Granularity ☆59 Updated 6 months ago
- ☆47 Updated 2 years ago
- ☆50 Updated last year
- The official repository for paper "PruneVid: Visual Token Pruning for Efficient Video Large Language Models". ☆35 Updated last month
- The suite of modeling video with Mamba ☆261 Updated 10 months ago
- Official repository for "Vita-CLIP: Video and text adaptive CLIP via Multimodal Prompting" [CVPR 2023] ☆116 Updated last year
- ☆16 Updated last year
- ☆24 Updated 9 months ago
- [ECCV 2024] SegVG: Transferring Object Bounding Box to Segmentation for Visual Grounding ☆54 Updated 5 months ago
- CVPR 2023 Accepted Paper HOICLIP: Efficient Knowledge Transfer for HOI Detection with Vision-Language Models ☆63 Updated last year
- ☆77 Updated last year
- [ECCV 2024] Official PyTorch implementation of TC-CLIP "Leveraging Temporal Contextualization for Video Action Recognition" ☆54 Updated last month
- Implementation of "VL-Mamba: Exploring State Space Models for Multimodal Learning" ☆81 Updated last year
- Video Reasoning Segmentation ☆20 Updated 4 months ago
- [ICCV 2023] Generative Prompt Model for Weakly Supervised Object Localization ☆57 Updated last year