"Object-Region Video Transformers", Herzig et al., CVPR 2022
☆50Jul 6, 2022Updated 3 years ago
Alternatives and similar repositories for ORViT
Users that are interested in ORViT are comparing it to the libraries listed below
- Codebase for "Revisiting spatio-temporal layouts for compositional action recognition" (Oral at BMVC 2021).☆27Apr 3, 2022Updated 3 years ago
- [ACM MM 2021] A causal perspective for compositional action recognition, providing a counterfactual debiasing inference implementation to…☆20May 5, 2022Updated 3 years ago
- Code repository for the paper: 'Something-Else: Compositional Action Recognition with Spatial-Temporal Interaction Networks'☆148Aug 25, 2023Updated 2 years ago
- ☆10Jan 3, 2023Updated 3 years ago
- 【ACMMM'2021】DSANet: Dynamic Segment Aggregation Network for Video-Level Representation Learning☆42Jul 7, 2021Updated 4 years ago
- AFNet(NeurIPS 2022)☆20Nov 24, 2022Updated 3 years ago
- This repository is a fork of https://github.com/joslefaure/HIT customized for the AVA dataset☆17Jun 17, 2023Updated 2 years ago
- [NeurIPS 2023] Learning Motion Refinement for Unsupervised Face Animation☆40Dec 3, 2023Updated 2 years ago
- [ECCV24] VISA: Reasoning Video Object Segmentation via Large Language Model☆19Jul 20, 2024Updated last year
- [ICCV 2023] Official implementation of paper "SOAR: Scene-debiasing Open-set Action Recognition".☆12Dec 23, 2023Updated 2 years ago
- MAtch, eXpand and Improve: Unsupervised Finetuning for Zero-Shot Action Recognition with Language Knowledge (ICCV 2023)☆30Sep 5, 2023Updated 2 years ago
- Materials for PyCon 2016 in Portland, Oregon☆10Aug 30, 2015Updated 10 years ago
- Video-Language Alignment via Spatio–Temporal Graph Transformer; ArXiv: https://arxiv.org/abs/2407.11677☆14Jul 24, 2024Updated last year
- [AAAI'25]: Building a Multi-modal Spatiotemporal Expert for Zero-shot Action Recognition with CLIP☆19Aug 5, 2025Updated 7 months ago
- PyTorch Implementation of "Object level Visual Reasoning in Videos", F. Baradel, N. Neverova, C. Wolf, J. Mille, G. Mori, ECCV 2018☆170Sep 11, 2018Updated 7 years ago
- Official project of DiverseSampling (ACMMM2022 Paper)☆16Feb 25, 2023Updated 3 years ago
- [CVPR 2023] STMixer: A One-Stage Sparse Action Detector☆63May 18, 2023Updated 2 years ago
- 🚴‍♂️ ConsNet: Learning Consistency Graph for Zero-Shot Human-Object Interaction Detection (MM 2020)☆35Jul 2, 2025Updated 8 months ago
- Is Depth Really Necessary for Salient Object Detection? ACM MM 2020☆22May 30, 2024Updated last year
- [ICLR2026] The code for "Interp3D: Correspondence-Aware Interpolation for Generative Textured 3D Morphing."☆26Jan 21, 2026Updated 2 months ago
- Official Pytorch Implementation of Relational Self-Attention, NeurIPS 2021☆49Dec 7, 2021Updated 4 years ago
- A UI automation engine☆11Updated this week
- PyTorch CZSL framework containing GQA, the open-world setting, and the CGE and CompCos methods.☆126Oct 29, 2025Updated 4 months ago
- [ICCVW 2023] Interaction-Aware Prompting for Zero-Shot Spatio-Temporal Action Detection☆21Feb 22, 2024Updated 2 years ago
- [CVPR 2024 Challenge] 1st Place Solution for MeViS Track in CVPR 2024 PVUW Workshop: Motion Expression guided Video Segmentation☆32Oct 18, 2024Updated last year
- A simple tkinter GUI for illustrating DFS and BFS.☆12Jun 26, 2020Updated 5 years ago
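- Re-identification of campus bicycles in surveillance-quality footage, including ReID model recognition, vector database retrieval, and a UI display☆11Feb 13, 2024Updated 2 years ago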
- A toolbox for evaluating salient object detection algorithms☆21Jan 20, 2014Updated 12 years ago
- A Rideshare Simulation built in C++, using OpenStreetMap data☆14Oct 24, 2021Updated 4 years ago
- Code for the paper "Generalizing Hand Segmentation in Egocentric Videos with Uncertainty-Guided Model Adaptation"☆36Aug 28, 2020Updated 5 years ago
- Official Implementation of our WACV2023 paper: “Holistic Interaction Transformer Network for Action Detection”☆70Jan 9, 2025Updated last year
- [NeurIPS 2022 Spotlight] VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training☆1,698Dec 8, 2023Updated 2 years ago
- The source code of our ACM MM 2019 paper "TGG: Transferable Graph Generation for Zero-shot and Few-shot Learning".☆25Feb 29, 2020Updated 6 years ago
- Multi objects distance estimation☆16Jun 22, 2022Updated 3 years ago
- Generalizing from SIMPLE to HARD Visual Reasoning: Can We Mitigate Modality Imbalance in VLMs?☆17Jun 3, 2025Updated 9 months ago
- Pick and Place project for RoboND Term 1☆17Feb 20, 2018Updated 8 years ago
- Discover feed url by RSS/Atom autodiscovery.☆14Nov 6, 2020Updated 5 years ago
- [CVPR 2024] Context-Guided Spatio-Temporal Video Grounding☆66Jun 28, 2024Updated last year
- ☆87Mar 4, 2024Updated 2 years ago