MCG-NJU / AMD
[CVPR 2024] Asymmetric Masked Distillation for Pre-Training Small Foundation Models
☆13 · Updated 5 months ago
Related projects
Alternatives and complementary repositories for AMD
- The official repository for ICLR2024 paper "FROSTER: Frozen CLIP is a Strong Teacher for Open-Vocabulary Action Recognition" ☆61 · Updated 7 months ago
- ICCV2023: Disentangling Spatial and Temporal Learning for Efficient Image-to-Video Transfer Learning ☆39 · Updated last year
- CVPR 2023 Accepted Paper HOICLIP: Efficient Knowledge Transfer for HOI Detection with Vision-Language Models ☆57 · Updated 8 months ago
- [ICCV'23] Official PyTorch implementation for paper "Exploring Predicate Visual Context in Detecting Human-Object Interactions" ☆68 · Updated 4 months ago
- [T-PAMI 2023] Temporal Perceiver: A General Architecture for Arbitrary Boundary Detection ☆35 · Updated last year
- [CVPR'23] AdaMAE: Adaptive Masking for Efficient Spatiotemporal Learning with Masked Autoencoders ☆73 · Updated 9 months ago
- [ICCV 2023] Official implementation of Memory-and-Anticipation Transformer for Online Action Understanding ☆45 · Updated last year
- Code for our IJCV 2023 paper "CLIP-guided Prototype Modulating for Few-shot Action Recognition". ☆49 · Updated 8 months ago
- [CVPR2024] The code of "UniPT: Universal Parallel Tuning for Transfer Learning with Efficient Parameter and Memory" ☆64 · Updated last month
- [ICCV 2023] MGMAE: Motion Guided Masking for Video Masked Autoencoding ☆20 · Updated last year
- SeqTR: A Simple yet Universal Network for Visual Grounding ☆131 · Updated 3 weeks ago
- Tracking with Human-Intent Reasoning ☆66 · Updated 3 weeks ago
- [ACM MM 2024] Hierarchical Multimodal Fine-grained Modulation for Visual Grounding. ☆32 · Updated last month
- [CVPR'24] OST: Refining Text Knowledge with Optimal Spatio-Temporal Descriptor for General Video Recognition ☆36 · Updated 6 months ago
- [AAAI 2024] Referred by Multi-Modality: A Unified Temporal Transformer for Video Object Segmentation ☆69 · Updated 4 months ago
- [CVPR 2024] Context-Guided Spatio-Temporal Video Grounding ☆42 · Updated 4 months ago
- Official code for the paper: MAR: Masked Autoencoders for Efficient Action Recognition ☆31 · Updated last year
- Official repository for "Vita-CLIP: Video and text adaptive CLIP via Multimodal Prompting" [CVPR 2023] ☆110 · Updated last year
- Self-Supervised Video Representation Learning with Motion-Aware Masked Autoencoders ☆23 · Updated 4 months ago
- [AAAI 2023] DQ-DETR: Dual Query Detection Transformer for Phrase Extraction and Grounding ☆56 · Updated last year
- [ECCV 2024] Code for Betrayed by Attention: A Simple yet Effective Approach for Self-supervised Video Object Segmentation ☆27 · Updated 4 months ago
- This is an official implementation for "Making Vision Transformers Efficient from A Token Sparsification View". ☆30 · Updated 5 months ago
- [NeurIPS 2023] Rank-DETR for High Quality Object Detection ☆87 · Updated last year
- Referring Video Object Segmentation / Multi-Object Tracking Repo ☆87 · Updated last year
- Video Test-Time Adaptation for Action Recognition (CVPR 2023) ☆36 · Updated last month