jhcho99 / GSRTR
[BMVC'21] Official PyTorch Implementation of "Grounded Situation Recognition with Transformers"
☆26Updated 3 years ago
Alternatives and similar repositories for GSRTR:
Users that are interested in GSRTR are comparing it to the libraries listed below
- Repository of "Improving Cross-Modal Retrieval With Set of Diverse Embeddings" (CVPR'23, Highlight)☆39Updated last year
- The official code for Devil's on the Edges: Selective Quad Attention for Scene Graph Generation, CVPR2023.☆22Updated last year
- [CVPR'22] Official PyTorch Implementation of "Collaborative Transformers for Grounded Situation Recognition"☆48Updated last year
- Official PyTorch implementation of the paper "CoVR: Learning Composed Video Retrieval from Web Video Captions".☆104Updated last week
- [ECCV'24] Official PyTorch implementation of In Defense of Lazy Visual Grounding for Open-Vocabulary Semantic Segmentation☆39Updated 6 months ago
- Code and data for the paper "Emergent Visual-Semantic Hierarchies in Image-Text Representations" (ECCV 2024)☆26Updated 7 months ago
- ☆46Updated 11 months ago
- Official implementation of TCL (CVPR 2023)☆109Updated last year
- Scene Graph Generate Zero Shot☆22Updated last year
- Official code for "Disentangling Visual Embeddings for Attributes and Objects" Published at CVPR 2022☆35Updated last year
- [ICCV 2021] Official code for "Learning to Generate Scene Graph from Natural Language Supervision"☆100Updated last year
- ICLR‘24 Offical Implementation of Composed Image Retrieval with Text Feedback via Multi-grained Uncertainty Regularization☆72Updated last year
- Vision Relation Transformer for Unbiased Scene Graph Generation (ICCV 2023)☆22Updated last year
- Official Implementation of ISR-DPO:Aligning Large Multimodal Models for Videos by Iterative Self-Retrospective DPO (AAAI'25)☆18Updated last month
- This repo contains the official implementation of ICLR 2024 paper "Is ImageNet worth 1 video? Learning strong image encoders from 1 long …☆84Updated 10 months ago
- ☆59Updated last year
- A Unified Framework for Video-Language Understanding☆57Updated last year
- ☆11Updated 8 months ago
- Official Implementation for paper "Referring Transformer: A One-step Approach to Multi-task Visual Grounding" Neurips 2021☆66Updated 2 years ago
- The official code for Relational Context Learning for Human-Object Interaction Detection, CVPR2023.☆48Updated last year
- Activity Grammars for Temporal Action Segmentation (NeurIPS 2023)☆12Updated 9 months ago
- Distribution-Aware Prompt Tuning for Vision-Language Models (ICCV 2023)☆38Updated last year
- COLA: Evaluate how well your vision-language model can Compose Objects Localized with Attributes!☆24Updated 4 months ago
- Official repository for the General Robust Image Task (GRIT) Benchmark☆53Updated 2 years ago
- [CVPR 2024] Contrasting Intra-Modal and Ranking Cross-Modal Hard Negatives to Enhance Visio-Linguistic Fine-grained Understanding☆45Updated 8 months ago
- Code and Models for "GeneCIS A Benchmark for General Conditional Image Similarity"☆56Updated last year
- [CVPR 2022 (oral)] Bongard-HOI for benchmarking few-shot visual reasoning☆66Updated 2 years ago
- [NeurIPS 2024] Official PyTorch Implementation of PartCLIPSeg☆45Updated 3 months ago
- [NeurIPS 2022 Spotlight] RLIP: Relational Language-Image Pre-training and a series of other methods to solve HOI detection and Scene Grap…☆73Updated 10 months ago
- VLPrompt: Vision-Language Prompting for Panoptic Scene Graph Generation☆25Updated 6 months ago