jhcho99 / GSRTR
[BMVC'21] Official PyTorch Implementation of "Grounded Situation Recognition with Transformers"
☆26 Updated 3 years ago
Alternatives and similar repositories for GSRTR:
Users who are interested in GSRTR are comparing it to the repositories listed below
- [CVPR'22] Official PyTorch Implementation of "Collaborative Transformers for Grounded Situation Recognition" ☆49 Updated 2 years ago
- Official PyTorch implementation of the paper "CoVR: Learning Composed Video Retrieval from Web Video Captions". ☆104 Updated last week
- Repository of "Improving Cross-Modal Retrieval With Set of Diverse Embeddings" (CVPR'23, Highlight) ☆39 Updated last year
- The official code for Devil's on the Edges: Selective Quad Attention for Scene Graph Generation, CVPR 2023. ☆22 Updated last year
- [ECCV'24] Official PyTorch implementation of In Defense of Lazy Visual Grounding for Open-Vocabulary Semantic Segmentation ☆40 Updated 6 months ago
- This repo contains the official implementation of ICLR 2024 paper "Is ImageNet worth 1 video? Learning strong image encoders from 1 long … ☆86 Updated 11 months ago
- ☆46 Updated 11 months ago
- ☆61 Updated last year
- Activity Grammars for Temporal Action Segmentation (NeurIPS 2023) ☆12 Updated 10 months ago
- The official code for Relational Context Learning for Human-Object Interaction Detection, CVPR 2023. ☆48 Updated last year
- ☆23 Updated last year
- Official implementation of TCL (CVPR 2023) ☆110 Updated last year
- ICLR'24 Official Implementation of Composed Image Retrieval with Text Feedback via Multi-grained Uncertainty Regularization ☆72 Updated last year
- Future Transformer for Long-term Action Anticipation (CVPR 2022) ☆48 Updated 2 years ago
- [CVPR 2022] Visual Abductive Reasoning ☆122 Updated 5 months ago
- Official PyTorch Implementation of Dynamic Hyperpixel Flow, ECCV 2020 ☆39 Updated 2 years ago
- ☆59 Updated last year
- Perceptual Grouping in Contrastive Vision-Language Models (ICCV'23) ☆36 Updated last year
- [ICCV 2021] Official code for "Learning to Generate Scene Graph from Natural Language Supervision" ☆100 Updated 2 years ago
- Code and data for the paper "Emergent Visual-Semantic Hierarchies in Image-Text Representations" (ECCV 2024) ☆27 Updated 8 months ago
- Official repository for the General Robust Image Task (GRIT) Benchmark ☆54 Updated 2 years ago
- Scene Graph Generate Zero Shot ☆22 Updated 2 years ago
- Code and data setup for the paper "Are Diffusion Models Vision-and-language Reasoners?" ☆32 Updated last year
- ☆30 Updated last year
- [ECCV 2022] Grounding Visual Representations with Texts for Domain Generalization ☆31 Updated 2 years ago
- [CVPR 2022 (oral)] Bongard-HOI for benchmarking few-shot visual reasoning ☆66 Updated 2 years ago
- ☆83 Updated 3 years ago
- Official Implementation for paper "Referring Transformer: A One-step Approach to Multi-task Visual Grounding" NeurIPS 2021 ☆66 Updated 2 years ago
- ☆58 Updated last year
- Official Implementation of ISR-DPO: Aligning Large Multimodal Models for Videos by Iterative Self-Retrospective DPO (AAAI'25)