CVMI-Lab / ResKD
[NeurIPS 2022] Official implementation of the paper "Rethinking Resolution in the Context of Efficient Video Recognition".
☆32 Updated last year
Related projects:
- Code for the paper "Detecting Any Human-Object Interaction Relationship: Universal HOI Detector with Spatial Prompt Learning on Foundatio… ☆22 Updated 10 months ago
- [CVPR 2022] X-Trans2Cap: Cross-Modal Knowledge Transfer using Transformer for 3D Dense Captioning ☆33 Updated 2 years ago
- [ICCV 2023] MGMAE: Motion Guided Masking for Video Masked Autoencoding ☆20 Updated 11 months ago
- Code and model for "Multi-dataset Training of Transformers for Robust Action Recognition", NeurIPS 2022 Spotlight ☆18 Updated last year
- Code accompanying Ego-Exo: Transferring Visual Representations from Third-person to First-person Videos (CVPR 2021) ☆33 Updated 3 years ago
- The official implementation of JM3D. ☆27 Updated 11 months ago
- [ECCV2022] Global Spectral Filter Memory Network for Video Object Segmentation ☆36 Updated 2 years ago
- Official code for "Opening up Open World Tracking" (CVPR 2022) ☆54 Updated last year
- CVPR 2021 VSPW: A Large-scale Dataset for Video Scene Parsing in the Wild ☆29 Updated last year
- CVPR2022 - Language-Bridged Spatial-Temporal Interaction for Referring Video Object Segmentation ☆22 Updated 2 years ago
- [TPAMI 2023] Local-Global Context Aware Transformer for Language-Guided Video Segmentation ☆47 Updated 8 months ago
- Self-supervised Point Cloud Representation Learning via Separating Mixed Shapes ☆18 Updated last year
- Code for the paper "Visual Recognition by Request". ☆44 Updated last year
- Temporal Pyramid Routing For Video Instance Segmentation-T-PAMI-2022 ☆25 Updated last year
- Pytorch implementation of "TokenCut: Segmenting Objects in Images and Videos with Self-supervised Transformer and Normalized Cut" ☆56 Updated last year
- [ECCV 2022] Multimodal Transformer with Variable-length Memory for Vision-and-Language Navigation ☆19 Updated 2 years ago
- [ECCV 2022] 🎵PolyphonicFormer: Unified Query Learning for Depth-aware Video Panoptic Segmentation ☆56 Updated last year
- Obj2Seq: Formatting Objects as Sequences with Class Prompt for Visual Tasks (NeurIPS2022) ☆84 Updated last year
- Official code of paper "PGT: A Progressive Method for Training Models on Long Videos" on CVPR2021 ☆28 Updated 3 years ago
- Official implementation of "Can Language Understand Depth?" ☆73 Updated last year
- The official implementation of Instance As Identity: A Generic Online Paradigm for Video Instance Segmentation. ☆16 Updated 2 years ago
- [ECCV 2024] Beyond MOT: Semantic Multi-Object Tracking ☆21 Updated last week
- [CVPR2022 Oral] VISOLO: Grid-Based Space-Time Aggregation for Efficient Online Video Instance Segmentation ☆29 Updated 2 years ago
- [CVPR2021] Look before you leap: learning landmark features for one-stage visual grounding. ☆46 Updated 3 years ago
- Unifying Visual Perception by Dispersible Points Learning (ECCV 2022) ☆51 Updated 2 years ago