wdrink/STTS

Official PyTorch implementation of the ECCV 2022 paper: Efficient Video Transformers with Spatial-Temporal Token Selection.

☆52

Alternatives and similar repositories for STTS

Users who are interested in STTS are comparing it to the repositories listed below.

LeapLabTHU / AdaFocusV2
[CVPR 2022] Official repository of AdaFocusV2.
☆91 · Dec 15, 2024 · Updated last year

Francis-Rings / ILA
[ICCV2023 Oral] Implicit Temporal Modeling with Learnable Alignment for Video Recognition
☆41 · Nov 29, 2023 · Updated 2 years ago

VideoNetworks / TokShift-Transformer
☆70 · Oct 6, 2023 · Updated 2 years ago

wentianli / MRI_RL
☆14 · May 16, 2021 · Updated 5 years ago

CVIR / CoMix
This repository contains the official implementation of CoMix (NeurIPS 2021) https://arxiv.org/pdf/2110.15128.pdf.
☆22 · Jan 12, 2022 · Updated 4 years ago

ruitian12 / resformer
Official PyTorch implementation of ResFormer: Scaling ViTs with Multi-Resolution Training, CVPR2023
☆30 · Jun 22, 2023 · Updated 3 years ago

cg1177 / DCAN
[AAAI 2022] DCAN: Improving Temporal Action Detection via Dual Context Aggregation
☆17 · Nov 13, 2022 · Updated 3 years ago

doc-doc / CoVGT
Contrastive Video Question Answering via Video Graph Transformer (IEEE T-PAMI'23)
☆20 · Mar 9, 2024 · Updated 2 years ago

doc-doc / NExT-GQA
Can I Trust Your Answer? Visually Grounded Video Question Answering (CVPR'24, Highlight)
☆89 · Jul 1, 2024 · Updated 2 years ago

wdrink / OpenTokenizer
☆21 · Jan 17, 2025 · Updated last year

wengzejia1 / Semiformer
☆36 · Nov 4, 2022 · Updated 3 years ago

MingTian99 / RSDformer
Learning An Effective Transformer for Remote Sensing Satellite Image Dehazing
☆12 · Sep 25, 2023 · Updated 2 years ago

shvdiwnkozbw / Video-Representation-via-Multi-level-Optimization
Code for Enhancing Self-supervised Video Representation Learning via Multi-level Feature Optimization.
☆10 · Sep 28, 2021 · Updated 4 years ago

MengLcool / SliMM
☆25 · Dec 26, 2024 · Updated last year

monjurulkarim / ROL_Dataset
Risky Object Localization (ROL) in a Driving Scene Dataset
☆15 · Dec 24, 2023 · Updated 2 years ago

md-mohaiminul / ViS4mer
☆58 · Dec 2, 2025 · Updated 7 months ago

mengcaopku / LocVTP
[ECCV 22] LocVTP: Video-Text Pre-training for Temporal Localization
☆39 · Jul 29, 2022 · Updated 3 years ago

wdrink / PyDeepFakeDet
PyDeepFakeDet is an integrated and scalable tool for Deepfake detection.
☆114 · Nov 6, 2022 · Updated 3 years ago

naver-ai / muco
Official Pytorch implementation of MuCo: Multi-turn Contrastive Learning for Multimodal Embedding Model (CVPR 2026)
☆15 · Apr 16, 2026 · Updated 3 months ago

BeSpontaneous / AFNet-pytorch
AFNet (NeurIPS 2022)
☆20 · Nov 24, 2022 · Updated 3 years ago

svip-lab / SVIP-Sequence-VerIfication-for-Procedures-in-Videos
[CVPR2022] SVIP: Sequence VerIfication for Procedures in Videos
☆24 · Feb 24, 2023 · Updated 3 years ago

blackfeather-wang / Dynamic-Vision-Transformer
Accelerating T2t-ViT by 1.6-3.6x.
☆260 · Nov 25, 2021 · Updated 4 years ago

xlliu7 / E2E-TAD
[CVPR 2022] An Empirical Study of End-to-end Temporal Action Detection
☆87 · Feb 19, 2023 · Updated 3 years ago

klauscc / VindLU
☆109 · Dec 23, 2022 · Updated 3 years ago

wision-lab / eventful-transformer
Code for our paper "Eventful Transformers: Leveraging Temporal Redundancy in Vision Transformers"
☆39 · Jan 27, 2026 · Updated 5 months ago

jiaozizhao / Two-in-One-ActionDetection
The code is for the CVPR 2019 paper 'Dance with Flow: Two-in-One Stream for Action Detection'
☆32 · Nov 21, 2022 · Updated 3 years ago

showlab / mist
☆37 · Dec 20, 2023 · Updated 2 years ago

mengyuest / AdaFuse
[ICLR2021] AdaFuse: Adaptive Temporal Fusion Network for Efficient Action Recognition
☆35 · Apr 8, 2021 · Updated 5 years ago

YangLiu9208 / TCGL
[IEEE T-IP 2022] TCGL: Temporal Contrastive Graph for Self-supervised Video Representation Learning
☆24 · Dec 19, 2023 · Updated 2 years ago

alinlab / temporal-selfsupervision
☆33 · Jul 28, 2022 · Updated 3 years ago

JPShi12 / VideoLoom
[ICML 2026] VideoLoom: A Video Large Language Model for Joint Spatial-Temporal Understanding
☆27 · Jul 3, 2026 · Updated 2 weeks ago

Li-ZK / MFDN-2020
Deep Multi-layer Fusion Dense Network for Hyperspectral Image Classification.
☆11 · Apr 25, 2021 · Updated 5 years ago

xuyu0010 / ATCoN
Repository for ECCV 2022 paper "Source-free Video Domain Adaptation by Learning Temporal Consistency for Action Recognition"
☆24 · Mar 9, 2023 · Updated 3 years ago

LiuRicky / ts2_net
[ECCV 2022] A pytorch implementation for TS2-Net: Token Shift and Selection Transformer for Text-Video Retrieval
☆80 · Nov 29, 2022 · Updated 3 years ago

OpenGVLab / efficient-video-recognition
☆184 · Aug 20, 2022 · Updated 3 years ago

rishikksh20 / ViViT-pytorch
Implementation of ViViT: A Video Vision Transformer
☆559 · Jun 21, 2021 · Updated 5 years ago

blackfeather-wang / AdaFocus
Reducing spatial redundancy in video recognition. SOTA computational efficiency.
☆128 · Dec 15, 2024 · Updated last year

StanfordVL / atp-video-language
Official repo for CVPR 2022 (Oral) paper: Revisiting the "Video" in Video-Language Understanding. Contains code for the Atemporal Probe (…
☆51 · May 29, 2024 · Updated 2 years ago

OpenGVLab / UniFormerV2
[ICCV2023] UniFormerV2: Spatiotemporal Learning by Arming Image ViTs with Video UniFormer
☆350 · Apr 2, 2024 · Updated 2 years ago