Mia-YatingYu/STDD

[AAAI'25]: Building a Multi-modal Spatiotemporal Expert for Zero-shot Action Recognition with CLIP

☆23

Alternatives and similar repositories for STDD

Users who are interested in STDD are comparing it to the repositories listed below.

engindeniz / vitis
[ICCV 2023 CLVL Workshop] Zero-Shot and Few-Shot Video Question Answering with Multi-Modal Prompts
☆13 · Jan 13, 2025 · Updated last year
R00Kie-Liu / TA2N
TA2N: Two-Stage Action Alignment Network for Few-Shot Action Recognition
☆17 · Mar 26, 2024 · Updated 2 years ago
RongchangLi / ZSCAR_C2C
[ECCV 2024 oral] C2C: Component-to-Composition Learning for Zero-Shot Compositional Action Recognition
☆43 · Dec 7, 2024 · Updated last year
alibaba-mmai-research / CLIP-FSAR
Code for our IJCV 2023 paper "CLIP-guided Prototype Modulating for Few-shot Action Recognition".
☆82 · Mar 7, 2024 · Updated 2 years ago
Visual-AI / FROSTER
[ICLR 2024] FROSTER: Frozen CLIP is a Strong Teacher for Open-Vocabulary Action Recognition
☆101 · Jan 14, 2025 · Updated last year
DeLightCMU / ElaborativeRehearsal
This is the official implementation of Elaborative Rehearsal for Zero-shot Action Recognition (ICCV2021)
☆37 · Apr 9, 2022 · Updated 4 years ago
jiazheng-xing / SloshNet
[AAAI2023] Revisiting the Spatial and Temporal Modeling for Few-shot Action Recognition (SloshNet)
☆14 · Jan 10, 2024 · Updated 2 years ago
Shahzadnit / EZ-CLIP
☆24 · May 11, 2025 · Updated last year
alibaba-mmai-research / HyRSMPlusPlus
Code for our paper "HyRSM++: Hybrid Relation Guided Temporal Set Matching for Few-shot Action Recognition".
☆15 · Jan 3, 2023 · Updated 3 years ago
intel / TVP
☆15 · Aug 4, 2025 · Updated 11 months ago
bofang98 / UATVR
[ICCV'23] UATVR: Uncertainty-Adaptive Text-Video Retrieval
☆13 · Nov 5, 2023 · Updated 2 years ago
starrycos / PAINet
[ICCV'23] PAINet: Parallel Attention Interaction Network for Few-shot Skeleton-based Action Recognition
☆11 · Oct 14, 2023 · Updated 2 years ago
minghangz / SPL
Generating Structured Pseudo Labels for Noise-resistant Zero-shot Video Sentence Localization
☆16 · Jul 20, 2023 · Updated 3 years ago
alibaba-mmai-research / Masked-Action-Recognition
Official code for the paper: MAR: Masked Autoencoders for Efficient Action Recognition
☆32 · Dec 7, 2022 · Updated 3 years ago
wlin-at / MAXI
MAtch, eXpand and Improve: Unsupervised Finetuning for Zero-Shot Action Recognition with Language Knowledge (ICCV 2023)
☆31 · Sep 5, 2023 · Updated 2 years ago
huangmozhi9527 / GMMFormer
[AAAI 2024] GMMFormer: Gaussian-Mixture-Model Based Transformer for Efficient Partially Relevant Video Retrieval
☆21 · May 10, 2024 · Updated 2 years ago
zgzxy001 / STMT
Code for the CVPR'23 paper: "STMT: A Spatial-Temporal Mesh Transformer for MoCap-Based Action Recognition"
☆21 · Dec 9, 2024 · Updated last year
yhZhai / SOAR
[ICCV 2023] Official implementation of paper "SOAR: Scene-debiasing Open-set Action Recognition".
☆12 · Dec 23, 2023 · Updated 2 years ago
ctX-u / PLOVAD
Source codes of our paper in TCSVT 2025: PLOVAD: Prompting Vision-Language Models for Open Vocabulary Video Anomaly Detection
☆33 · Feb 15, 2025 · Updated last year
wengzejia1 / Open-VCLIP
☆119 · Feb 19, 2024 · Updated 2 years ago
XiaoBuL / OmniCLIP
[ECAI-2024] OmniCLIP: Adapting CLIP for Video Recognition with Spatial-Temporal Omni-Scale Feature Learning
☆16 · Jan 7, 2025 · Updated last year
usydnlp / VICTR
This repository contains code for paper VICTR: Visual Information Captured Text Representation for Text-to-Image Multimodal Tasks
☆14 · Nov 20, 2021 · Updated 4 years ago
ByZ0e / Glance-Focus
This repo contains source code for Glance and Focus: Memory Prompting for Multi-Event Video Question Answering (Accepted in NeurIPS 2023)
☆31 · Jun 28, 2024 · Updated 2 years ago
skelemoa / synse-zsl
Official PyTorch code for the ICIP 2021 paper 'Syntactically Guided Generative Embeddings For Zero Shot Skeleton Action Recognition'
☆31 · Mar 17, 2023 · Updated 3 years ago
whwu95 / BIKE
【CVPR'2023】Bidirectional Cross-Modal Knowledge Exploration for Video Recognition with Pre-trained Vision-Language Models
☆156 · Sep 9, 2024 · Updated last year
HuiGuanLab / HiCo
This repository contains the implementation of our AAAI'23 oral paper Hierarchical Contrast for Unsupervised Skeleton-based Action R…
☆31 · Feb 15, 2023 · Updated 3 years ago
JasonCodeMaker / CTVR
☆16 · Jun 2, 2025 · Updated last year
marco-garosi / ComCa
Official implementation of the CVPR '25 highlight paper "Compositional Caching for Training-free Open-vocabulary Attribute Detection"
☆23 · Dec 23, 2024 · Updated last year
icq-benchmark / icq-benchmark
☆19 · Jul 28, 2025 · Updated 11 months ago
OSVAI / Ske2Grid
The official project website of "Ske2Grid: Skeleton-to-Grid Representation Learning for Action Recognition" (The paper of Ske2Grid is pub…
☆19 · Sep 6, 2023 · Updated 2 years ago
bladewaltz1 / PromptSwitch
☆30 · Aug 14, 2023 · Updated 2 years ago
AmeenAli / VideoMatch
☆14 · Jan 5, 2022 · Updated 4 years ago
Liuxiyao / SADG-Net-for-HOI
Code for Semantic-Aware Dynamic Generation Networks for Few-Shot Human-Object Interaction Recognition
☆10 · May 26, 2021 · Updated 5 years ago
Westlake-AI / SemiReward
[ICLR 2024] SemiReward: A General Reward Model for Semi-supervised Learning
☆76 · Nov 9, 2025 · Updated 8 months ago
fuyunwang / DPDL
Distribution Prototype Diffusion Learning for Open-set Supervised Anomaly Detection CVPR 2025
☆28 · Feb 28, 2025 · Updated last year
ChenJiayi68 / DMTNet
☆12 · Aug 15, 2024 · Updated last year
emu1729 / GIST
Generating Image Specific Text
☆29 · Aug 14, 2023 · Updated 2 years ago
xiaoxing2001 / DeGLA
[ACM MM25] Official Pytorch implementation of [Decoupled Global-Local Alignment for Improving Compositional Understanding]
☆16 · Jul 15, 2025 · Updated last year
VinAIResearch / fsvc-ata
Inductive and Transductive Few-Shot Video Classification via Appearance and Temporal Alignments (ECCV 2022)
☆26 · Nov 12, 2024 · Updated last year