sail-sg / dualformer
☆30 · Updated 2 years ago
Alternatives and similar repositories for dualformer:
Users who are interested in dualformer are comparing it to the repositories listed below.
- Official PyTorch implementation of the ECCV 2022 paper: Efficient Video Transformers with Spatial-Temporal Token Selection. ☆46 · Updated 2 years ago
- This is the official implementation of Elaborative Rehearsal for Zero-shot Action Recognition (ICCV2021) ☆36 · Updated 2 years ago
- [CVPR 2023] Implementation of Siamese Image Modeling for Self-Supervised Vision Representation Learning ☆35 · Updated 8 months ago
- ☆47 · Updated 2 years ago
- ☆16 · Updated last year
- Video Test-Time Adaptation for Action Recognition (CVPR 2023) ☆39 · Updated 4 months ago
- Official code for "Dynamic Token Normalization Improves Vision Transformer", ICLR 2022. ☆28 · Updated 2 years ago
- ☆20 · Updated last year
- Tests different pooling methods used in CNNs for computer vision tasks ☆35 · Updated 4 years ago
- [CVPR 2022 Oral] Towards Open Set Temporal Action Localization ☆49 · Updated last year
- Turning to Video for Transcript Sorting ☆48 · Updated last year
- Code for Motion-aware Contrastive Video Representation Learning via Foreground-background Merging (CVPR 2022) ☆46 · Updated last year
- TCPNet ☆30 · Updated 3 years ago
- [IEEE T-IP 2022] TCGL: Temporal Contrastive Graph for Self-supervised Video Representation Learning ☆24 · Updated last year
- [CVPR2022 Oral] The official code for "TransRank: Self-supervised Video Representation Learning via Ranking-based Transformation Recognit… ☆18 · Updated 2 years ago
- [ICLR2024] Exploring Target Representations for Masked Autoencoders ☆52 · Updated last year
- ☆70 · Updated last year
- [CVPR'23] AdaMAE: Adaptive Masking for Efficient Spatiotemporal Learning with Masked Autoencoders ☆74 · Updated last year
- code base for vision transformers ☆36 · Updated 3 years ago
- [arXiv 2023] This repository contains the code for "MUPPET: Multi-Modal Few-Shot Temporal Action Detection" ☆14 · Updated last year
- ☆27 · Updated 2 years ago
- Official code for the paper, "TaCA: Upgrading Your Visual Foundation Model with Task-agnostic Compatible Adapter". ☆16 · Updated last year
- PyTorch Implementation of "Your ViT is Secretly a Hybrid Discriminative-Generative Diffusion Model" ☆48 · Updated 2 years ago
- [CVPR 2022] Cross-Architecture Self-supervised Video Representation Learning ☆22 · Updated 2 years ago
- AFNet (NeurIPS 2022) ☆19 · Updated 2 years ago
- repo for paper titled: Towards Realistic Zero-Shot Classification via Self Structural Semantic Alignment (AAAI'24 Oral) ☆25 · Updated 9 months ago
- Lightweight Transformer for Multi-modal Tasks ☆15 · Updated 2 years ago
- A Toolkit for Video Action Recognition (Classification/Detection) ☆16 · Updated 2 years ago
- ☆16 · Updated 2 years ago
- Official codes for ConMIM (ICLR 2023) ☆58 · Updated 2 years ago