OpenGVLab / UniFormerV2Links

[ICCV2023] UniFormerV2: Spatiotemporal Learning by Arming Image ViTs with Video UniFormer

☆325

Alternatives and similar repositories for UniFormerV2

Users that are interested in UniFormerV2 are comparing it to the libraries listed below

Sorting:

ruiwang2021 / mvd
[CVPR2023] Masked Video Distillation: Rethinking Masked Feature Modeling for Self-supervised Video Representation Learning (https://arxiv…
☆133Updated 2 years ago
sallymmx / ActionCLIP
This is the official implement of paper "ActionCLIP: A New Paradigm for Action Recognition"
☆570Updated last year
dingfengshi / TriDet
[CVPR2023] Code for the paper, TriDet: Temporal Action Detection with Relative Boundary Modeling
☆189Updated last year
OpenGVLab / VideoMAEv2
[CVPR 2023] VideoMAE V2: Scaling Video Masked Autoencoders with Dual Masking
☆669Updated 9 months ago
OpenGVLab / efficient-video-recognition
☆176Updated 2 years ago
daniel-code / TubeViT
An unofficial implementation of TubeViT in "Rethinking Video ViTs: Sparse Video Tubes for Joint Image and Video Learning"
☆92Updated 10 months ago
happyharrycn / actionformer_release
Code release for ActionFormer (ECCV 2022)
☆508Updated last year
sming256 / OpenTAD
OpenTAD is an open-source temporal action detection (TAD) toolbox based on PyTorch.
☆276Updated 3 months ago
MCG-NJU / VideoMAE-Action-Detection
[NeurIPS 2022 Spotlight] VideoMAE for Action Detection
☆67Updated 2 years ago
mx-mark / VideoTransformer-pytorch
PyTorch implementation of a collections of scalable Video Transformer Benchmarks.
☆300Updated 3 years ago
OpenGVLab / unmasked_teacher
[ICCV2023 Oral] Unmasked Teacher: Towards Training-Efficient Video Foundation Models
☆336Updated last year
alibaba-mmai-research / TAdaConv
[ICLR 2022] TAda! Temporally-Adaptive Convolutions for Video Understanding. This codebase provides solutions for video classification, vi…
☆240Updated last year
taoyang1122 / adapt-image-models
[ICLR'23] AIM: Adapting Image Models for Efficient Video Action Recognition
☆291Updated last year
OpenGVLab / video-mamba-suite
The suite of modeling video with Mamba
☆280Updated last year
ju-chen / Efficient-Prompt
☆193Updated 2 years ago
xyzforever / BEVT
PyTorch implementation of BEVT (CVPR 2022) https://arxiv.org/abs/2112.01529
☆163Updated 3 years ago
muzairkhattak / ViFi-CLIP
[CVPR 2023] Official repository of paper titled "Fine-tuned CLIP models are efficient video learners".
☆285Updated last year
Tramac / tiny-kinetics-400
Tiny Kinetics-400 for test
☆93Updated last year
haofanwang / video-swin-transformer-pytorch
Video Swin Transformer - PyTorch
☆260Updated 3 years ago
dingfengshi / tridetplus
Code for the paper, Temporal Action Localization with Enhanced Instant Discriminability
☆26Updated last year
sming256 / AdaTAD
[CVPR2024] The official implementation of AdaTAD: End-to-End Temporal Action Detection with 1B Parameters Across 1000 Frames
☆36Updated last year
BeSpontaneous / FFN-pytorch
Frame Flexible Network (CVPR2023)
☆56Updated 2 years ago
yingsen1 / UniMD
UniMD: Towards Unifying Moment retrieval and temporal action Detection
☆51Updated last year
TencentARC / UMT
UMT is a unified and flexible framework which can handle different input modality combinations, and output video moment retrieval and/or …
☆223Updated last year
amazon-science / tubelet-transformer
This is an official implementation of TubeR: Tubelet Transformer for Video Action Detection
☆81Updated 2 years ago
wjn922 / ReferFormer
[CVPR2022] Official Implementation of ReferFormer
☆343Updated 5 months ago
TuanTNG / TemporalMaxer
TemporalMaxer: Maximize Temporal Context with only Max Pooling for Temporal Action Localization
☆58Updated 2 years ago
microsoft / SwinBERT
Research code for CVPR 2022 paper "SwinBERT: End-to-End Transformers with Sparse Attention for Video Captioning"
☆240Updated 3 years ago
xlliu7 / E2E-TAD
[CVPR 2022] An Empirical Study of End-to-end Temporal Action Detection
☆85Updated 2 years ago
facebookresearch / mvit
Code Release for MViTv2 on Image Recognition.
☆435Updated 8 months ago