taoyang1122 / adapt-image-models
[ICLR'23] AIM: Adapting Image Models for Efficient Video Action Recognition
☆300 · Sep 17, 2023 · Updated 2 years ago
Alternatives and similar repositories for adapt-image-models
Users who are interested in adapt-image-models are comparing it to the repositories listed below
- ICCV2023: Disentangling Spatial and Temporal Learning for Efficient Image-to-Video Transfer Learning ☆41 · Sep 25, 2023 · Updated 2 years ago
- 【CVPR'2023】Bidirectional Cross-Modal Knowledge Exploration for Video Recognition with Pre-trained Vision-Language Models ☆155 · Sep 9, 2024 · Updated last year
- [CVPR 2023] Official repository of paper titled "Fine-tuned CLIP models are efficient video learners". ☆305 · Apr 3, 2024 · Updated last year
- Official repository for "Vita-CLIP: Video and text adaptive CLIP via Multimodal Prompting" [CVPR 2023] ☆127 · Jul 1, 2023 · Updated 2 years ago
- ☆181 · Aug 20, 2022 · Updated 3 years ago
- [ECCV 2024] ZeroI2V: Zero-Cost Adaptation of Pre-trained Transformers from Image to Video ☆20 · Jul 29, 2024 · Updated last year
- ☆42 · Apr 7, 2024 · Updated last year
- ❄️🔥 Visual Prompt Tuning [ECCV 2022] https://arxiv.org/abs/2203.12119 ☆1,214 · Sep 2, 2023 · Updated 2 years ago
- Code for our IJCV 2023 paper "CLIP-guided Prototype Modulating for Few-shot Action Recognition". ☆77 · Mar 7, 2024 · Updated last year
- [ICCV2023 Oral] Unmasked Teacher: Towards Training-Efficient Video Foundation Models ☆347 · May 27, 2024 · Updated last year
- ☆110 · Dec 23, 2022 · Updated 3 years ago
- ☆120 · Feb 19, 2024 · Updated last year
- [ICLR 2024] FROSTER: Frozen CLIP is a Strong Teacher for Open-Vocabulary Action Recognition ☆97 · Jan 14, 2025 · Updated last year
- This is the official implement of paper "ActionCLIP: A New Paradigm for Action Recognition" ☆602 · Dec 6, 2023 · Updated 2 years ago
- [CVPR 2023] VideoMAE V2: Scaling Video Masked Autoencoders with Dual Masking ☆750 · Oct 8, 2024 · Updated last year
- MAtch, eXpand and Improve: Unsupervised Finetuning for Zero-Shot Action Recognition with Language Knowledge (ICCV 2023) ☆30 · Sep 5, 2023 · Updated 2 years ago
- 【ICCV'2023】What Can Simple Arithmetic Operations Do for Temporal Modeling? ☆73 · Jan 26, 2024 · Updated 2 years ago
- Official implementation of PCS in essay "Prompt Vision Transformer for Domain Generalization" ☆50 · Jan 29, 2023 · Updated 3 years ago
- [NeurIPS 2022 Spotlight] VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training ☆1,675 · Dec 8, 2023 · Updated 2 years ago
- Official PyTorch implementation of the paper "Revisiting Temporal Modeling for CLIP-based Image-to-Video Knowledge Transferring" ☆107 · Jan 28, 2024 · Updated 2 years ago
- Prompt Learning for Vision-Language Models (IJCV'22, CVPR'22) ☆2,173 · May 20, 2024 · Updated last year
- [NeurIPS 2022] Implementation of "AdaptFormer: Adapting Vision Transformers for Scalable Visual Recognition" ☆379 · Sep 16, 2022 · Updated 3 years ago
- [ICCV 2023 & AAAI 2023] Binary Adapters & FacT, [Tech report] Convpass ☆198 · Aug 1, 2023 · Updated 2 years ago
- Multi-modality pre-training ☆507 · May 8, 2024 · Updated last year
- Official implementation for the paper "Prompt Pre-Training with Over Twenty-Thousand Classes for Open-Vocabulary Visual Recognition" ☆259 · May 3, 2024 · Updated last year
- Code release for "Learning Video Representations from Large Language Models" ☆536 · Oct 1, 2023 · Updated 2 years ago
- This is an official implementation for "Video Swin Transformers". ☆1,630 · Mar 8, 2023 · Updated 2 years ago
- ☆16 · Aug 5, 2022 · Updated 3 years ago
- Turning to Video for Transcript Sorting ☆49 · Aug 27, 2023 · Updated 2 years ago
- Towards a Unified View on Visual Parameter-Efficient Transfer Learning ☆26 · Oct 13, 2022 · Updated 3 years ago
- [ICCV2023 Oral] Implicit Temporal Modeling with Learnable Alignment for Video Recognition ☆41 · Nov 29, 2023 · Updated 2 years ago
- The first unofficial implementation of CLIP4Caption: CLIP for Video Caption (ACMMM 2021) ☆15 · Jan 2, 2023 · Updated 3 years ago
- [ICCV 2021] MGSampler: An Explainable Sampling Strategy for Video Action Recognition ☆51 · Jul 9, 2022 · Updated 3 years ago
- [ECCV2024] Video Foundation Models & Data for Multimodal Understanding ☆2,196 · Dec 15, 2025 · Updated last month
- ☆937 · May 15, 2024 · Updated last year
- Winner solution to Generic Event Boundary Captioning task in LOVEU Challenge (CVPR 2023 workshop) ☆29 · Jan 1, 2024 · Updated 2 years ago
- Source code of our MM'22 paper Partially Relevant Video Retrieval ☆55 · Nov 4, 2024 · Updated last year
- The first work for cross-domain open-vocabulary action recognition with a benchmark ☆20 · May 27, 2024 · Updated last year
- Official implementation of "In-style: Bridging Text and Uncurated Videos with Style Transfer for Cross-modal Retrieval." ICCV 2023 ☆11 · Oct 5, 2023 · Updated 2 years ago