muzairkhattak / ViFi-CLIP
[CVPR 2023] Official repository of paper titled "Fine-tuned CLIP models are efficient video learners".
☆305 · Apr 3, 2024 · Updated last year
Alternatives and similar repositories for ViFi-CLIP
Users who are interested in ViFi-CLIP are comparing it to the repositories listed below:
- MAtch, eXpand and Improve: Unsupervised Finetuning for Zero-Shot Action Recognition with Language Knowledge (ICCV 2023) ☆30 · Sep 5, 2023 · Updated 2 years ago
- Official repository for "Vita-CLIP: Video and text adaptive CLIP via Multimodal Prompting" [CVPR 2023] ☆127 · Jul 1, 2023 · Updated 2 years ago
- [CVPRW-25 MMFM] Official repository of paper titled "How Good is my Video LMM? Complex Video Reasoning and Robustness Evaluation Suite fo… ☆50 · Aug 23, 2024 · Updated last year
- ☆193 · Oct 22, 2022 · Updated 3 years ago
- ☆120 · Feb 19, 2024 · Updated last year
- 【CVPR'24】OST: Refining Text Knowledge with Optimal Spatio-Temporal Descriptor for General Video Recognition ☆38 · Apr 27, 2024 · Updated last year
- 【CVPR'2023】Bidirectional Cross-Modal Knowledge Exploration for Video Recognition with Pre-trained Vision-Language Models ☆155 · Sep 9, 2024 · Updated last year
- [CVPR 2023] Official repository of paper titled "MaPLe: Multi-modal Prompt Learning". ☆803 · Jul 24, 2023 · Updated 2 years ago
- ☆181 · Aug 20, 2022 · Updated 3 years ago
- Validating image classification benchmark results on ViTs and ResNets (v2) ☆13 · Nov 3, 2022 · Updated 3 years ago
- This is the official implementation of the paper "ActionCLIP: A New Paradigm for Action Recognition" ☆602 · Dec 6, 2023 · Updated 2 years ago
- VideoX: a collection of video cross-modal models ☆1,061 · Jun 3, 2024 · Updated last year
- [ACCV 2024] ObjectCompose: Evaluating Resilience of Vision-Based Models on Object-to-Background Compositional Changes 🚀🚀🚀 ☆37 · Jan 21, 2025 · Updated last year
- ☆85 · May 8, 2023 · Updated 2 years ago
- [ICCV'23 Main Track, WECIA'23 Oral] Official repository of paper titled "Self-regulating Prompts: Foundational Model Adaptation without F… ☆284 · Sep 28, 2023 · Updated 2 years ago
- [CVPRW 2025] Official repository of paper titled "Towards Evaluating the Robustness of Visual State Space Models" ☆25 · Jun 8, 2025 · Updated 8 months ago
- [MICCAI 2023] Official code repository of paper titled "Frequency Domain Adversarial Training for Robust Volumetric Medical Segmentation"… ☆52 · Nov 14, 2023 · Updated 2 years ago
- [ECCV 2022] Official PyTorch Implementation of the paper: "Zero-Shot Temporal Action Detection via Vision-Language Prompting" ☆112 · Aug 3, 2023 · Updated 2 years ago
- [ICLR'23] AIM: Adapting Image Models for Efficient Video Action Recognition ☆300 · Sep 17, 2023 · Updated 2 years ago
- [ACL 2024 🔥] Video-ChatGPT is a video conversation model capable of generating meaningful conversation about videos. It combines the cap… ☆1,488 · Aug 5, 2025 · Updated 6 months ago
- ☆42 · Apr 7, 2024 · Updated last year
- ☆11 · Oct 29, 2024 · Updated last year
- [CVPR 2025 🔥] A Large Multimodal Model for Pixel-Level Visual Grounding in Videos ☆96 · Apr 14, 2025 · Updated 10 months ago
- [⭐ CVPR 2025 Highlight ⭐] Official Implementation of the paper STEREO: A Two-Stage Framework for Adversarially Robust Concept Erasing fro… ☆29 · Apr 22, 2025 · Updated 9 months ago
- ICCV2023: Disentangling Spatial and Temporal Learning for Efficient Image-to-Video Transfer Learning ☆41 · Sep 25, 2023 · Updated 2 years ago
- [AAAI'25, CVPRW 2024] Official repository of paper titled "Learning to Prompt with Text Only Supervision for Vision-Language Models". ☆121 · Dec 17, 2024 · Updated last year
- Composed Video Retrieval ☆62 · May 2, 2024 · Updated last year
- 【AAAI'2023 & IJCV】Transferring Vision-Language Models for Visual Recognition: A Classifier Perspective ☆199 · May 30, 2024 · Updated last year
- Learnable Weight Initialization for Volumetric Medical Image Segmentation [Elsevier AIM2024] ☆22 · Oct 27, 2024 · Updated last year
- [NeurIPS 2023] Align Your Prompts: Test-Time Prompting with Distribution Alignment for Zero-Shot Generalization ☆110 · Feb 11, 2024 · Updated 2 years ago
- Official code repository of paper titled "Test-Time Low Rank Adaptation via Confidence Maximization for Zero-Shot Generalization of Visio… ☆31 · May 11, 2025 · Updated 9 months ago
- [MICCAI 2024] Official code repository of paper titled "BAPLe: Backdoor Attacks on Medical Foundation Models using Prompt Learning" accep… ☆56 · Oct 22, 2024 · Updated last year
- [ECCV2024] Video Foundation Models & Data for Multimodal Understanding ☆2,196 · Dec 15, 2025 · Updated 2 months ago
- COLA: Evaluate how well your vision-language model can Compose Objects Localized with Attributes! ☆25 · Nov 23, 2024 · Updated last year
- Official repository for "Video-FocalNets: Spatio-Temporal Focal Modulation for Video Action Recognition" [ICCV 2023] ☆101 · Apr 30, 2024 · Updated last year
- PG-Video-LLaVA: Pixel Grounding in Large Multimodal Video Models ☆261 · Aug 5, 2025 · Updated 6 months ago
- [ECCVW 2024 -- ORAL] Official repository of paper titled "Makeup-Guided Facial Privacy Protection via Untrained Neural Network Priors". ☆12 · Oct 11, 2024 · Updated last year
- This is the official repository of LLAVIDAL ☆23 · Oct 4, 2025 · Updated 4 months ago
- Code release for "Learning Video Representations from Large Language Models" ☆536 · Oct 1, 2023 · Updated 2 years ago