danielchyeh / this-is-my
Official This-Is-My Dataset published in CVPR 2023
☆16Updated 8 months ago
Alternatives and similar repositories for this-is-my:
Users that are interested in this-is-my are comparing it to the libraries listed below
- COLA: Evaluate how well your vision-language model can Compose Objects Localized with Attributes!☆24Updated 4 months ago
- Official implementation of "HowToCaption: Prompting LLMs to Transform Video Annotations at Scale." ECCV 2024☆52Updated 6 months ago
- [NeurIPS 2024] Official PyTorch implementation of "Improving Compositional Reasoning of CLIP via Synthetic Vision-Language Negatives"☆37Updated 4 months ago
- ☆26Updated last year
- [CVPR23 Highlight] CREPE: Can Vision-Language Foundation Models Reason Compositionally?☆32Updated last year
- (NeurIPS 2024 Spotlight) TOPA: Extend Large Language Models for Video Understanding via Text-Only Pre-Alignment☆26Updated 6 months ago
- Code and Models for "GeneCIS A Benchmark for General Conditional Image Similarity"☆58Updated last year
- FreeVA: Offline MLLM as Training-Free Video Assistant☆57Updated 10 months ago
- VisualGPTScore for visio-linguistic reasoning☆27Updated last year
- Code and data for the paper "Emergent Visual-Semantic Hierarchies in Image-Text Representations" (ECCV 2024)☆27Updated 8 months ago
- Repository for the paper: Teaching VLMs to Localize Specific Objects from In-context Examples☆23Updated 4 months ago
- Compress conventional Vision-Language Pre-training data☆49Updated last year
- Official implementation of CVPR 2024 paper "vid-TLDR: Training Free Token merging for Light-weight Video Transformer".☆46Updated 11 months ago
- [CVPR2024 Highlight] Official implementation for Transferable Visual Prompting. The paper "Exploring the Transferability of Visual Prompt…☆39Updated 3 months ago
- ☆46Updated last month
- Official PyTorch code of GroundVQA (CVPR'24)☆59Updated 7 months ago
- [CVPRW-25 MMFM] Official repository of paper titled "How Good is my Video LMM? Complex Video Reasoning and Robustness Evaluation Suite fo…☆47Updated 7 months ago
- This repository contains the Adverbs in Recipes (AIR) dataset and the code published at the CVPR 23 paper: "Learning Action Changes by Me…☆13Updated last year
- Distribution-Aware Prompt Tuning for Vision-Language Models (ICCV 2023)☆38Updated last year
- The official repository for paper "PruneVid: Visual Token Pruning for Efficient Video Large Language Models".☆35Updated 2 months ago
- Composed Video Retrieval☆53Updated 11 months ago
- ☆26Updated last year
- Official code for the paper, "TaCA: Upgrading Your Visual Foundation Model with Task-agnostic Compatible Adapter".☆16Updated last year
- Accelerating Vision-Language Pretraining with Free Language Modeling (CVPR 2023)☆32Updated last year
- X-MIC: Cross-Modal Instance Conditioning for Egocentric Action Generalization, CVPR 2024☆11Updated 5 months ago
- Official Pytorch implementation of 'Facing the Elephant in the Room: Visual Prompt Tuning or Full Finetuning'? (ICLR2024)☆13Updated last year
- ICCV2023: Disentangling Spatial and Temporal Learning for Efficient Image-to-Video Transfer Learning☆41Updated last year
- Official implementation of "In-style: Bridging Text and Uncurated Videos with Style Transfer for Cross-modal Retrieval." ICCV 2023☆11Updated last year
- Code and datasets for "Text encoders are performance bottlenecks in contrastive vision-language models". Coming soon!☆11Updated last year
- [NeurIPS-24] This is the official implementation of the paper "DeepStack: Deeply Stacking Visual Tokens is Surprisingly Simple and Effect…☆35Updated 10 months ago