Vision-CAIR / MammalNet
☆25 Updated 5 months ago
Related projects:
- [CVPR2023] Masked Video Distillation: Rethinking Masked Feature Modeling for Self-supervised Video Representation Learning (https://arxiv… ☆101 Updated last year
- [CVPR2022] Animal Kingdom: A Large and Diverse Dataset for Animal Behavior Understanding ☆126 Updated 7 months ago
- Actor-agnostic Multi-label Action Recognition with Multi-modal Query [ICCVW '23] ☆20 Updated 11 months ago
- [CVPR'23] AdaMAE: Adaptive Masking for Efficient Spatiotemporal Learning with Masked Autoencoders ☆71 Updated 7 months ago
- The official repository for ICLR2024 paper "FROSTER: Frozen CLIP is a Strong Teacher for Open-Vocabulary Action Recognition" ☆55 Updated 5 months ago
- [ECCV 2024] PyTorch implementation of CropMAE, introduced in "Efficient Image Pre-Training with Siamese Cropped Masked Autoencoders" ☆43 Updated 2 months ago
- ☆35 Updated 3 months ago
- Benchmarking Panoptic Video Scene Graph Generation (PVSG), CVPR'23 ☆74 Updated 4 months ago
- Official repository for "Video-FocalNets: Spatio-Temporal Focal Modulation for Video Action Recognition" [ICCV 2023] ☆84 Updated 4 months ago
- Official repository for "Self-Supervised Video Transformer" (CVPR'22) ☆100 Updated 2 months ago
- [CVPR2024 Highlight] Official repository of the paper "The devil is in the fine-grained details: Evaluating open-vocabulary object detect… ☆39 Updated last month
- Code release for "EgoVLPv2: Egocentric Video-Language Pre-training with Fusion in the Backbone" [ICCV, 2023] ☆85 Updated 2 months ago
- "Object-Region Video Transformers", Herzig et al., CVPR 2022 ☆42 Updated 2 years ago
- [ICCV 2023] "Rethinking pose estimation in crowds: overcoming the detection information-bottleneck and ambiguity" ☆84 Updated 3 months ago
- ☆45 Updated last year
- [NeurIPS 2022 Spotlight] VideoMAE for Action Detection ☆48 Updated last year
- MAtch, eXpand and Improve: Unsupervised Finetuning for Zero-Shot Action Recognition with Language Knowledge (ICCV 2023) ☆25 Updated last year
- Pytorch implementation of "TokenCut: Segmenting Objects in Images and Videos with Self-supervised Transformer and Normalized Cut" ☆56 Updated last year
- Official codes for the paper "In the Eye of Transformer: Global-Local Correlation for Egocentric Gaze Estimation". ☆19 Updated 3 weeks ago
- ☆37 Updated 8 months ago
- Pytorch code for Frame-wise Action Representations for Long Videos via Sequence Contrastive Learning, CVPR2022. ☆83 Updated last year
- ☆87 Updated 2 months ago
- ☆29 Updated last year
- ☆165 Updated 2 years ago
- [CVPR 2022] OCSampler: Compressing Videos to One Clip with Single-step Sampling ☆17 Updated 2 years ago
- BEAR: a new BEnchmark on video Action Recognition ☆40 Updated 4 months ago
- ☆67 Updated last year
- [CVPR 2023] Official repository of paper titled "Fine-tuned CLIP models are efficient video learners". ☆240 Updated 5 months ago
- The official implementation of our paper "Sports Video Analysis on Large-scale Data" (https://arxiv.org/abs/2208.04897) ☆56 Updated last year
- An unofficial implementation of TubeViT in "Rethinking Video ViTs: Sparse Video Tubes for Joint Image and Video Learning" ☆83 Updated last week