linziyi96 / st-adapter
☆84 Updated 2 years ago
Alternatives and similar repositories for st-adapter
Users who are interested in st-adapter are comparing it to the repositories listed below.
- Official repository for "Vita-CLIP: Video and text adaptive CLIP via Multimodal Prompting" [CVPR 2023] ☆126 Updated 2 years ago
- [ICLR 2024] FROSTER: Frozen CLIP is a Strong Teacher for Open-Vocabulary Action Recognition ☆91 Updated 10 months ago
- ☆30 Updated 2 years ago
- ICCV2023: Disentangling Spatial and Temporal Learning for Efficient Image-to-Video Transfer Learning ☆41 Updated 2 years ago
- [NeurIPS 2022 Spotlight] RLIP: Relational Language-Image Pre-training and a series of other methods to solve HOI detection and Scene Grap… ☆78 Updated last year
- Video Test-Time Adaptation for Action Recognition (CVPR 2023) ☆50 Updated last year
- ☆42 Updated last year
- ☆119 Updated last year
- Official PyTorch implementation of the paper "Revisiting Temporal Modeling for CLIP-based Image-to-Video Knowledge Transferring" ☆107 Updated last year
- [ICCV 2023] ALIP: Adaptive Language-Image Pre-training with Synthetic Caption ☆101 Updated 2 years ago
- ☆38 Updated 2 years ago
- ☆62 Updated 2 years ago
- CVPR 2023 Accepted Paper HOICLIP: Efficient Knowledge Transfer for HOI Detection with Vision-Language Models ☆68 Updated last year
- Official Codes for Fine-Grained Visual Prompting, NeurIPS 2023 ☆55 Updated last year
- [CVPR 2024] Do you remember? Dense Video Captioning with Cross-Modal Memory Retrieval ☆61 Updated last year
- ☆26 Updated 2 years ago
- Official Pytorch implementation of "E2VPT: An Effective and Efficient Approach for Visual Prompt Tuning". (ICCV2023) ☆70 Updated last year
- ☆49 Updated 3 years ago
- [CVPR2024] The code of "UniPT: Universal Parallel Tuning for Transfer Learning with Efficient Parameter and Memory" ☆67 Updated last year
- Task Residual for Tuning Vision-Language Models (CVPR 2023) ☆73 Updated 2 years ago
- Code for Static and Dynamic Concepts for Self-supervised Video Representation Learning. ☆11 Updated 3 years ago
- [ Arxiv 2023 ] This repository contains the code for "MUPPET: Multi-Modal Few-Shot Temporal Action Detection" ☆15 Updated 2 years ago
- [CVPR 2024] TeachCLIP for Text-to-Video Retrieval ☆40 Updated 6 months ago
- [CVPR2023] The code for 《Position-guided Text Prompt for Vision-Language Pre-training》 ☆151 Updated 2 years ago
- An unofficial implementation of TubeViT in "Rethinking Video ViTs: Sparse Video Tubes for Joint Image and Video Learning" ☆93 Updated last year
- ☆22 Updated 2 years ago
- Official PyTorch code of GroundVQA (CVPR'24) ☆64 Updated last year
- Learning Hierarchical Prompt with Structured Linguistic Knowledge for Vision-Language Models (AAAI 2024) ☆73 Updated 9 months ago
- ☆104 Updated last year
- ☆191 Updated 3 years ago