fmthoker / SEVERE-BENCHMARK
☆26Updated last year
Alternatives and similar repositories for SEVERE-BENCHMARK:
Users that are interested in SEVERE-BENCHMARK are comparing it to the libraries listed below
- Official code for "Disentangling Visual Embeddings for Attributes and Objects" Published at CVPR 2022☆35Updated last year
- ☆16Updated 4 months ago
- CVPR2022:Learning from Untrimmed Videos: Self-Supervised Video Representation Learning with Hierarchical Consistency☆17Updated 2 years ago
- Implementation of paper 'Helping Hands: An Object-Aware Ego-Centric Video Recognition Model'☆33Updated last year
- repo for paper titled: Towards Realistic Zero-Shot Classification via Self Structural Semantic Alignment (AAAI'24 Oral)☆25Updated 11 months ago
- Compress conventional Vision-Language Pre-training data☆49Updated last year
- Code for Static and Dynamic Concepts for Self-supervised Video Representation Learning.☆10Updated 2 years ago
- This repository contains the Adverbs in Recipes (AIR) dataset and the code published at the CVPR 23 paper: "Learning Action Changes by Me…☆13Updated last year
- COLA: Evaluate how well your vision-language model can Compose Objects Localized with Attributes!☆24Updated 5 months ago
- Official implementation of "In-style: Bridging Text and Uncurated Videos with Style Transfer for Cross-modal Retrieval." ICCV 2023☆11Updated last year
- Official Pytorch implementation of 'Facing the Elephant in the Room: Visual Prompt Tuning or Full Finetuning'? (ICLR2024)☆14Updated last year
- The Pytorch implementation for "Video-Text Pre-training with Learned Regions"☆42Updated 2 years ago
- X-MIC: Cross-Modal Instance Conditioning for Egocentric Action Generalization, CVPR 2024☆11Updated 5 months ago
- Perceptual Grouping in Contrastive Vision-Language Models (ICCV'23)☆36Updated last year
- ☆50Updated 2 years ago
- ICCV2023: Disentangling Spatial and Temporal Learning for Efficient Image-to-Video Transfer Learning☆41Updated last year
- The implementation of CVPR2021 paper Temporal Query Networks for Fine-grained Video Understanding☆62Updated 3 years ago
- [CVPR23 Highlight] CREPE: Can Vision-Language Foundation Models Reason Compositionally?☆32Updated 2 years ago
- ☆61Updated last year
- Official implementation of "HowToCaption: Prompting LLMs to Transform Video Annotations at Scale." ECCV 2024☆53Updated 7 months ago
- Official code for the ICLR2023 paper Compositional Prompt Tuning with Motion Cues for Open-vocabulary Video Relation Detection☆43Updated 11 months ago
- [ECCV-2022]Grounding Visual Representations with Texts for Domain Generalization☆31Updated 2 years ago
- Are Binary Annotations Sufficient? Video Moment Retrieval via Hierarchical Uncertainty-based Active Learning☆14Updated last year
- Official code for the paper, "TaCA: Upgrading Your Visual Foundation Model with Task-agnostic Compatible Adapter".☆16Updated last year
- Official This-Is-My Dataset published in CVPR 2023☆16Updated 9 months ago
- ☆55Updated 2 years ago
- [ICLR 2023] Temporal Alignment Representations with Contrastive Learning☆26Updated 2 years ago
- BEAR: a new BEnchmark on video Action Recognition☆43Updated last year
- Official implementation of CVPR 2024 paper "vid-TLDR: Training Free Token merging for Light-weight Video Transformer".☆47Updated 11 months ago
- ☆56Updated this week