jhcho99 / GSRTR
[BMVC'21] Official PyTorch Implementation of "Grounded Situation Recognition with Transformers"
☆26 · Updated 2 years ago
Related projects
Alternatives and complementary repositories for GSRTR
- [CVPR'22] Official PyTorch Implementation of "Collaborative Transformers for Grounded Situation Recognition" ☆43 · Updated last year
- [ECCV'24] Official PyTorch implementation of In Defense of Lazy Visual Grounding for Open-Vocabulary Semantic Segmentation ☆35 · Updated last month
- Repository of "Improving Cross-Modal Retrieval With Set of Diverse Embeddings" (CVPR'23, Highlight) ☆36 · Updated last year
- The official code for Devil's on the Edges: Selective Quad Attention for Scene Graph Generation, CVPR2023. ☆22 · Updated last year
- Activity Grammars for Temporal Action Segmentation (NeurIPS 2023) ☆12 · Updated 5 months ago
- Perceptual Grouping in Contrastive Vision-Language Models (ICCV'23) ☆37 · Updated 10 months ago
- Vision Relation Transformer for Unbiased Scene Graph Generation (ICCV 2023) ☆22 · Updated last year
- Scene Graph Generate Zero Shot ☆18 · Updated last year
- The official code for Relational Context Learning for Human-Object Interaction Detection, CVPR2023. ☆48 · Updated last year
- Official PyTorch implementation of the paper "CoVR: Learning Composed Video Retrieval from Web Video Captions". ☆88 · Updated 3 weeks ago
- [ECCV 2024] ControlCap: Controllable Region-level Captioning ☆55 · Updated 3 weeks ago
- [NeurIPS 2023] OV-PARTS: Towards Open-Vocabulary Part Segmentation ☆72 · Updated 4 months ago
- Implementation of paper 'Helping Hands: An Object-Aware Ego-Centric Video Recognition Model' ☆31 · Updated last year
- Official repository of the "Shatter and Gather: Learning Referring Image Segmentation with Text Supervision (ICCV'23)" ☆33 · Updated 9 months ago
- Code for the paper "Detecting Any Human-Object Interaction Relationship: Universal HOI Detector with Spatial Prompt Learning on Foundatio… ☆23 · Updated last year
- COLA: Evaluate how well your vision-language model can Compose Objects Localized with Attributes! ☆22 · Updated 5 months ago
- Official implementation of TCL (CVPR 2023) ☆109 · Updated last year
- code for "Multitask Vision-Language Prompt Tuning" https://arxiv.org/abs/2211.11720 ☆54 · Updated 5 months ago
- Official code for "Disentangling Visual Embeddings for Attributes and Objects" Published at CVPR 2022 ☆33 · Updated last year
- [CVPR 2022 (oral)] Bongard-HOI for benchmarking few-shot visual reasoning ☆64 · Updated 2 years ago
- Code Release of F-LMM: Grounding Frozen Large Multimodal Models ☆49 · Updated 3 months ago
- Official implementation of CVPR 2024 paper "vid-TLDR: Training Free Token merging for Light-weight Video Transformer". ☆37 · Updated 6 months ago
- [CVPR' 2024] Contrasting Intra-Modal and Ranking Cross-Modal Hard Negatives to Enhance Visio-Linguistic Fine-grained Understanding ☆41 · Updated 3 months ago