Augusta-A / Awesome-EfficientVideo
☆13 · Updated 3 years ago
Related projects
Alternatives and complementary repositories for Awesome-EfficientVideo
- [CVPR2022 Oral] The official code for "TransRank: Self-supervised Video Representation Learning via Ranking-based Transformation Recognit… ☆18 · Updated 2 years ago
- [CVPR2022] Unsupervised Pre-training for Temporal Action Localization Tasks (UP-TAL) ☆29 · Updated 2 years ago
- AFNet (NeurIPS 2022) ☆19 · Updated last year
- ☆47 · Updated 2 years ago
- Teach-DETR: Better Training DETR with Teachers ☆29 · Updated 7 months ago
- RF-Next: Efficient Receptive Field Search for CNN (TPAMI2022, CVPR2021). Try it, you won't regret it! ☆63 · Updated last year
- [ECCV 2022] Official PyTorch Implementation of paper: "Proposal-Free Temporal Action Detection with Global Segmentation Mask Learning"… ☆18 · Updated 2 years ago
- [CVPR 2022] OCSampler: Compressing Videos to One Clip with Single-step Sampling ☆17 · Updated 2 years ago
- ICCV2023: Disentangling Spatial and Temporal Learning for Efficient Image-to-Video Transfer Learning ☆39 · Updated last year
- Official Code for VideoLT: Large-scale Long-tailed Video Recognition (ICCV 2021) ☆33 · Updated 2 years ago
- ☆32 · Updated 11 months ago
- ☆17 · Updated 2 years ago
- [ECCV 2022] AMixer: Adaptive Weight Mixing for Self-attention Free Vision Transformers ☆28 · Updated last year
- Official PyTorch implementation of Clover: Towards A Unified Video-Language Alignment and Fusion Model (CVPR2023) ☆40 · Updated last year
- Benchmarking Attention Mechanism in Vision Transformers. ☆16 · Updated 2 years ago
- [ICCV 23] This is a PyTorch implementation of our paper "SMMix: Self-Motivated Image Mixing for Vision Transformers" ☆17 · Updated last year
- Implementation of the paper 'Helping Hands: An Object-Aware Ego-Centric Video Recognition Model' ☆31 · Updated last year
- ☆28 · Updated last year
- Tests of different pooling methods used in CNNs for computer vision tasks ☆35 · Updated 3 years ago
- [TPAMI 2023] Local-Global Context Aware Transformer for Language-Guided Video Segmentation ☆48 · Updated 9 months ago
- Reducing spatial redundancy in video recognition. SOTA computational efficiency. ☆122 · Updated 2 years ago
- The official repository for ICLR2024 paper "FROSTER: Frozen CLIP is a Strong Teacher for Open-Vocabulary Action Recognition" ☆61 · Updated 7 months ago
- ☆69 · Updated last month
- Code and models for the paper Glance-and-Gaze Vision Transformer ☆28 · Updated 3 years ago
- Code base for vision transformers ☆35 · Updated 2 years ago
- ACM Multimedia 2023 (Oral) - RTQ: Rethinking Video-language Understanding Based on Image-text Model ☆14 · Updated 9 months ago
- [ACM MM 22] Correspondence Matters for Video Referring Expression Comprehension ☆14 · Updated 2 years ago
- Turning to Video for Transcript Sorting ☆46 · Updated last year
- [CVPR 2022 Oral] Towards Open Set Temporal Action Localization ☆50 · Updated last year
- The PyTorch implementation for "Video-Text Pre-training with Learned Regions" ☆42 · Updated 2 years ago