Contrastive Video Question Answering via Video Graph Transformer (IEEE T-PAMI'23)
☆19Mar 9, 2024Updated last year
Alternatives and similar repositories for CoVGT
Users who are interested in CoVGT are comparing it to the repositories listed below
- ☆12Dec 15, 2023Updated 2 years ago
- Video Graph Transformer for Video Question Answering (ECCV'22)☆49Jun 8, 2023Updated 2 years ago
- Video as Conditional Graph Hierarchy for Multi-Granular Question Answering (AAAI'22, Oral)☆34Sep 17, 2022Updated 3 years ago
- ☆36Dec 20, 2023Updated 2 years ago
- Can I Trust Your Answer? Visually Grounded Video Question Answering (CVPR'24, Highlight)☆83Jul 1, 2024Updated last year
- Official repo for CVPR 2022 (Oral) paper: Revisiting the "Video" in Video-Language Understanding. Contains code for the Atemporal Probe (…☆51May 29, 2024Updated last year
- ☆13Aug 14, 2022Updated 3 years ago
- Spatial-Temporal Knowledge-Embedded Transformer for Video Scene Graph Generation (TIP 2024, ACM MM 2023)☆19Mar 13, 2024Updated last year
- [EMNLP’24 Main] Encoding and Controlling Global Semantics for Long-form Video Question Answering☆18Oct 9, 2024Updated last year
- VQACL: A Novel Visual Question Answering Continual Learning Setting (CVPR'23)☆44Mar 28, 2024Updated last year
- Code for CVPR 2023 paper "SViTT: Temporal Learning of Sparse Video-Text Transformers"☆20Jun 16, 2023Updated 2 years ago
- Learning Situation Hyper-Graphs for Video Question Answering☆22Feb 16, 2024Updated 2 years ago
- This repo contains code for Invariant Grounding for Video Question Answering☆27Mar 2, 2023Updated 3 years ago
- PyTorch implementation of "Debiased Visual Question Answering from Feature and Sample Perspectives" (NeurIPS 2021)☆27Oct 13, 2022Updated 3 years ago
- [AAAI 2023 Oral] VLTinT: Visual-Linguistic Transformer-in-Transformer for Coherent Video Paragraph Captioning☆68Feb 16, 2024Updated 2 years ago
- Belief Revision based Caption Re-ranker with Visual Semantic Information. COLING 2022☆11Apr 13, 2025Updated 10 months ago
- A bug-free and improved implementation of LLaVA-UHD, based on the code from the official repo☆34Aug 12, 2024Updated last year
- Retrieval Augmented Generation, but no servers involved. Backed by S3☆12Nov 3, 2023Updated 2 years ago
- Implementation of "Look, Listen and Recognise: Character-Aware Audio-Visual Subtitling"☆19Nov 3, 2025Updated 3 months ago
- A large-scale training and benchmarking framework for rPPG.☆10Nov 26, 2024Updated last year
- [WACV 2026] PyTorch code for 4D-Animal.☆27Nov 18, 2025Updated 3 months ago
- Placeholder☆10Jul 17, 2023Updated 2 years ago
- [ICML2023] Instant Soup Cheap Pruning Ensembles in A Single Pass Can Draw Lottery Tickets from Large Models. Ajay Jaiswal, Shiwei Liu, Ti…☆11Nov 28, 2023Updated 2 years ago
- A guide to structured generation using constrained decoding☆14Jun 9, 2024Updated last year
- SSL Video Representation Learning project☆14Jul 8, 2025Updated 7 months ago
- Character Grounding and Re-Identification in Story of Videos and Text Descriptions☆10Jan 17, 2021Updated 5 years ago
- Phoshell: a Forth inspired, extremely lightweight, stack machine shell, implementable in _ALL_ known programming languages.☆10Nov 21, 2020Updated 5 years ago
- PyTorch implementation of Count-ception and custom CNN counting models for the Kaggle Sea Lion Count challenge☆10Jun 30, 2017Updated 8 years ago
- Risky Object Localization (ROL) in a Driving Scene Dataset☆15Dec 24, 2023Updated 2 years ago
- A drag-and-drop-enabled, responsive envelope graph that lets you shape a wave with attack, decay, sustain, and release☆11Jan 5, 2023Updated 3 years ago
- Learning Precise Affordances from Egocentric Videos for Robotic Manipulation (ICCV 2025)☆17Jan 30, 2026Updated last month
- Hunan University doctoral dissertation LaTeX template☆12May 29, 2019Updated 6 years ago
- 🔊Replicate Cog'ified MMAudio🎵☆18Jul 10, 2025Updated 7 months ago
- Fast Contextual Scene Graph Generation with Unbiased Context Augmentation☆12Aug 7, 2023Updated 2 years ago
- (IJCAI 2023) Sph2Pob: Boosting Object Detection on Spherical Images with Planar Oriented Boxes Methods☆13Aug 23, 2023Updated 2 years ago
- A collection of papers tackling automatic fact-checking (particularly of AI-generated content)☆14Nov 3, 2023Updated 2 years ago
- NExT-QA: Next Phase of Question-Answering to Explaining Temporal Actions (CVPR'21)☆184Aug 2, 2025Updated 7 months ago
- Official PyTorch implementation of the ECCV 2022 paper: Efficient Video Transformers with Spatial-Temporal Token Selection.☆51Jul 13, 2022Updated 3 years ago
- A ComfyUI custom node by BillBum for API-based generation (VLM LLM T2I API Tools)☆10Feb 4, 2026Updated 3 weeks ago