jingjing12110 / MixPHM
[CVPR 2023] Pytorch Code of MixPHM: Redundancy-Aware Parameter-Efficient Tuning for Low-Resource Visual Question Answering
☆16 · Updated last year
Alternatives and similar repositories for MixPHM
Users who are interested in MixPHM are comparing it to the repositories listed below.
- Code implementation of paper "MUSE: Mamba is Efficient Multi-scale Learner for Text-video Retrieval (AAAI2025)" ☆19 · Updated 3 months ago
- ☆23 · Updated 2 years ago
- Official PyTorch code of GroundVQA (CVPR'24) ☆60 · Updated 8 months ago
- [AAAI 2024] DGL: Dynamic Global-Local Prompt Tuning for Text-Video Retrieval. ☆41 · Updated 7 months ago
- [EMNLP'22] Weakly-Supervised Temporal Article Grounding ☆14 · Updated last year
- With a Little Help from your own Past: Prototypical Memory Networks for Image Captioning. ICCV 2023 ☆17 · Updated 11 months ago
- ☆20 · Updated last year
- Official repository for "Vita-CLIP: Video and text adaptive CLIP via Multimodal Prompting" [CVPR 2023] ☆116 · Updated last year
- 【NeurIPS 2024】The official code of paper "Automated Multi-level Preference for MLLMs" ☆19 · Updated 7 months ago
- ✨A curated list of papers on the uncertainty in multi-modal large language model (MLLM). ☆44 · Updated last month
- [ICCV 2021] Official implementation of the paper "TRAR: Routing the Attention Spans in Transformers for Visual Question Answering" ☆66 · Updated 3 years ago
- [CVPR 2023 Highlight & TPAMI] Video-Text as Game Players: Hierarchical Banzhaf Interaction for Cross-Modal Representation Learning ☆117 · Updated 4 months ago
- Official repository of DoraemonGPT: Toward Understanding Dynamic Scenes with Large Language Models ☆84 · Updated 8 months ago
- [ICCV2023] Official code for "VL-PET: Vision-and-Language Parameter-Efficient Tuning via Granularity Control" ☆53 · Updated last year
- [AAAI2023] Symbolic Replay: Scene Graph as Prompt for Continual Learning on VQA Task (Oral) ☆39 · Updated last year
- Pytorch implementation for Egoinstructor at CVPR 2024 ☆21 · Updated 5 months ago
- [IJCAI 2023] Text-Video Retrieval with Disentangled Conceptualization and Set-to-Set Alignment ☆51 · Updated last year
- ☆19 · Updated 6 months ago
- FreeVA: Offline MLLM as Training-Free Video Assistant ☆61 · Updated 11 months ago
- ACM Multimedia 2023 (Oral) - RTQ: Rethinking Video-language Understanding Based on Image-text Model ☆16 · Updated last year
- Learning Situation Hyper-Graphs for Video Question Answering ☆20 · Updated last year
- ☆27 · Updated last year
- An official implementation for MS-DETR in ACL'23 ☆16 · Updated last year
- (NeurIPS 2024 Spotlight) TOPA: Extend Large Language Models for Video Understanding via Text-Only Pre-Alignment ☆30 · Updated 7 months ago
- [CVPR2024] The code of "UniPT: Universal Parallel Tuning for Transfer Learning with Efficient Parameter and Memory" ☆68 · Updated 6 months ago
- Implementation of "VL-Mamba: Exploring State Space Models for Multimodal Learning" ☆81 · Updated last year
- Video Graph Transformer for Video Question Answering (ECCV'22) ☆47 · Updated last year
- This repo contains source code for Glance and Focus: Memory Prompting for Multi-Event Video Question Answering (Accepted in NeurIPS 2023) ☆27 · Updated 10 months ago
- NegCLIP. ☆31 · Updated 2 years ago
- ☆35 · Updated 10 months ago