chengyzhao / TextPSG
☆18 · Updated last year
Related projects
Alternatives and complementary repositories for TextPSG
- DSGG: Dense Relation Transformer for an End-to-end Scene Graph Generation ☆8 · Updated 4 months ago
- Official implementation of the paper "Boosting Human-Object Interaction Detection with Text-to-Image Diffusion Model" ☆49 · Updated last year
- [ECCV 2024] ControlCap: Controllable Region-level Captioning ☆55 · Updated 3 weeks ago
- 👾 E.T. Bench: Towards Open-Ended Event-Level Video-Language Understanding (NeurIPS 2024) ☆34 · Updated 2 weeks ago
- Code for paper "Super-CLEVR: A Virtual Benchmark to Diagnose Domain Robustness in Visual Reasoning" ☆23 · Updated last year
- Scene Graph Generate Zero Shot ☆18 · Updated last year
- Team Doggeee's solution to the Ego4D LTA challenge at CVPRW'23 ☆11 · Updated last year
- This is the official repository for the paper "Visually-Prompted Language Model for Fine-Grained Scene Graph Generation in an Open World"… ☆44 · Updated 8 months ago
- Vision Relation Transformer for Unbiased Scene Graph Generation (ICCV 2023) ☆22 · Updated last year
- [NeurIPS 2022 Spotlight] RLIP: Relational Language-Image Pre-training and a series of other methods to solve HOI detection and Scene Grap… ☆72 · Updated 5 months ago
- Benchmarking Panoptic Video Scene Graph Generation (PVSG), CVPR'23 ☆79 · Updated 6 months ago
- Official implementation for CoVLM: Composing Visual Entities and Relationships in Large Language Models Via Communicative Decoding ☆42 · Updated last year
- Code for the paper "Detecting Any Human-Object Interaction Relationship: Universal HOI Detector with Spatial Prompt Learning on Foundatio… ☆23 · Updated last year
- [ICCV 2023] RLIPv2: Fast Scaling of Relational Language-Image Pre-training ☆118 · Updated 5 months ago
- mask2former psg ☆22 · Updated last year
- VLPrompt: Vision-Language Prompting for Panoptic Scene Graph Generation ☆20 · Updated last month
- [ECCV 2024] OpenPSG: Open-set Panoptic Scene Graph Generation via Large Multimodal Models ☆31 · Updated last month
- [ICCV 2023] HiLo: Exploiting High Low Frequency Relations for Unbiased Panoptic Scene Graph Generation ☆29 · Updated 9 months ago
- This repository contains the Adverbs in Recipes (AIR) dataset and the code published at the CVPR 23 paper: "Learning Action Changes by Me… ☆13 · Updated last year
- [ECCV 2024 Best Paper Candidate] Implementation of "Expanding Scene Graph Boundaries: Fully Open-vocabulary Scene Graph Generation via Vi… ☆41 · Updated last month
- Official code implementation of the paper "AntGPT: Can Large Language Models Help Long-term Action Anticipation from Videos?" ☆19 · Updated 2 months ago
- [ECCV 2024] Learning Video Context as Interleaved Multimodal Sequences ☆30 · Updated last month
- Action Scene Graphs for Long-Form Understanding of Egocentric Videos (CVPR 2024) ☆30 · Updated last month
- [CVPR 2023] Official code for "Learning Procedure-aware Video Representation from Instructional Videos and Their Narrations" ☆51 · Updated last year
- Egocentric Video Understanding Dataset (EVUD) ☆24 · Updated 4 months ago
- [CVPR 2023] HierVL: Learning Hierarchical Video-Language Embeddings ☆44 · Updated last year
- Implementation of paper 'Helping Hands: An Object-Aware Ego-Centric Video Recognition Model' ☆31 · Updated last year