"Object-Region Video Transformers", Herzig et al., CVPR 2022
☆50Jul 6, 2022Updated 3 years ago
Alternatives and similar repositories for ORViT
Users that are interested in ORViT are comparing it to the libraries listed below.
- Object-Region Video Transformers☆24Mar 24, 2022Updated 4 years ago
- Codebase for "Revisiting spatio-temporal layouts for compositional action recognition" (Oral at BMVC 2021).☆27Apr 3, 2022Updated 4 years ago
- ☆10Jan 3, 2023Updated 3 years ago
- 【ACMMM'2021】DSANet: Dynamic Segment Aggregation Network for Video-Level Representation Learning☆41Jul 7, 2021Updated 4 years ago
- AFNet(NeurIPS 2022)☆20Nov 24, 2022Updated 3 years ago
- This repository is a fork of https://github.com/joslefaure/HIT customized for the AVA dataset☆17Jun 17, 2023Updated 3 years ago
- N-EPIC-Kitchens: The event-based camera extension of the large-scale EPIC-Kitchens dataset.☆23May 10, 2022Updated 4 years ago
- ☆13Nov 29, 2021Updated 4 years ago
- [ECCV24] VISA: Reasoning Video Object Segmentation via Large Language Model☆22Jul 20, 2024Updated last year
- [ICCV 2023] Official implementation of paper "SOAR: Scene-debiasing Open-set Action Recognition".☆12Dec 23, 2023Updated 2 years ago
- A zero-shot captcha solver.☆16Dec 22, 2023Updated 2 years ago
- MAtch, eXpand and Improve: Unsupervised Finetuning for Zero-Shot Action Recognition with Language Knowledge (ICCV 2023)☆31Sep 5, 2023Updated 2 years ago
- Video-Language Alignment via Spatio–Temporal Graph Transformer; ArXiv: https://arxiv.org/abs/2407.11677☆15Jul 24, 2024Updated last year
- ☆12Aug 5, 2022Updated 3 years ago
- Pytorch Implementation of "Object level Visual Reasoning in Videos", F. Baradel, N. Neverova, C. Wolf, J. Mille, G. Mori , ECCV 2018☆170Sep 11, 2018Updated 7 years ago
- Code for the paper "Understanding and Evaluating Racial Biases in Image Captioning"☆12Mar 26, 2026Updated 3 months ago
- [CVPR 2023] STMixer: A One-Stage Sparse Action Detector☆64May 18, 2023Updated 3 years ago
- 🚴‍♂️ ConsNet: Learning Consistency Graph for Zero-Shot Human-Object Interaction Detection (MM 2020)☆35Jul 2, 2025Updated last year
- Implementation of the paper Video Action Transformer Network☆138Apr 5, 2021Updated 5 years ago
- Is Depth Really Necessary for Salient Object Detection? ACM MM 2020☆22May 30, 2024Updated 2 years ago
- [ICLR2026] The code for "Interp3D: Correspondence-Aware Interpolation for Generative Textured 3D Morphing."☆31Jan 21, 2026Updated 5 months ago
- Implementation of 3D attention mechanisms based on https://github.com/LeftAttention/Attention-Codebase. Thanks to LeftAttention for shari…☆12Feb 22, 2022Updated 4 years ago
- Official Pytorch Implementation of Relational Self-Attention, NeurIPS 2021☆49Dec 7, 2021Updated 4 years ago
- EPIC-Kitchens-100 Action Recognition baselines: TSN, TRN, TSM☆33Mar 15, 2022Updated 4 years ago
- [ICCVW 2023] Interaction-Aware Prompting for Zero-Shot Spatio-Temporal Action Detection☆21Feb 22, 2024Updated 2 years ago
- [CVPR 2024 Challenge] 1st Place Solution for MeViS Track in CVPR 2024 PVUW Workshop: Motion Expression guided Video Segmentation☆32Oct 18, 2024Updated last year
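- Re-identification of campus bicycles in surveillance-quality footage, including ReID model recognition, vector database retrieval, and a UI display☆11Feb 13, 2024Updated 2 years ago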
- The code repository for "Cross-Modal and Hierarchical Modeling of Video and Text" in PyTorch☆16Apr 22, 2019Updated 7 years ago
- ☆60Sep 14, 2024Updated last year
- ☆109Dec 23, 2022Updated 3 years ago
- Code and Dataset for our CVPR 2022 paper "Video Shadow Detection via Spatio-Temporal Interpolation Consistency Training"☆12Jul 8, 2022Updated 3 years ago
- Multi-head Recurrent Layer Attention for Vision Network☆23Mar 2, 2023Updated 3 years ago
- Official implementation for "GLASS: Global to Local Attention for Scene-Text Spotting" (ECCV'22)☆102Jun 28, 2024Updated 2 years ago
- Official Implementation of our WACV2023 paper: “Holistic Interaction Transformer Network for Action Detection”☆72Jan 9, 2025Updated last year
- [NeurIPS 2022 Spotlight] VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training☆1,761Dec 8, 2023Updated 2 years ago
- ☆14Mar 31, 2022Updated 4 years ago
- Pretraining summarization models using a corpus of nonsense☆13Sep 28, 2021Updated 4 years ago
- Official Code for MIMETIC^2☆13Nov 19, 2024Updated last year
- The source code of our ACM MM 2019 paper "TGG: Transferable Graph Generation for Zero-shot and Few-shot Learning".☆25Feb 29, 2020Updated 6 years ago