sallymmx/m2clip

[AAAI 2024 Oral] M2CLIP: A Multimodal, Multi-Task Adapting Framework for Video Action Recognition

☆70

Alternatives and similar repositories for m2clip

Users who are interested in m2clip are comparing it to the repositories listed below.

sallymmx / ActionCLIP
This is the official implementation of the paper "ActionCLIP: A New Paradigm for Action Recognition"
☆613 · Dec 6, 2023 · Updated 2 years ago
MCG-NJU / ViT-TAD
[CVPR 2024] Adapting Short-Term Transformers for Action Detection in Untrimmed Videos
☆11 · Jun 11, 2024 · Updated 2 years ago
DAVEISHAN / TimeBalance
Placeholder
☆10 · Jul 17, 2023 · Updated 3 years ago
wengzejia1 / Open-VCLIP
☆119 · Feb 19, 2024 · Updated 2 years ago
LeiWangR / video-ar
Taylor videos and Taylor-transformed skeletons (ICML 2024).
☆17 · Jul 25, 2024 · Updated last year
kunli-cs / PCAN
[AAAI 2025] Official implementation of the paper: Prototypical Calibrating Ambiguous Samples for Micro-Action Recognition
☆15 · Jul 16, 2025 · Updated last year
alibaba-mmai-research / DiST
ICCV2023: Disentangling Spatial and Temporal Learning for Efficient Image-to-Video Transfer Learning
☆41 · Sep 25, 2023 · Updated 2 years ago
MartinXM / TPS
A simple but efficient transformer model for video action recognition
☆64 · Oct 8, 2022 · Updated 3 years ago
uestc-lj / CFFN
☆18 · Oct 10, 2023 · Updated 2 years ago
osiriszjq / impulse_init
Convolutional Initialization for Data-Efficient Vision Transformers
☆15 · Dec 9, 2025 · Updated 7 months ago
xandery-geek / BadCM
[IEEE TIP] Official implementation for the work "BadCM: Invisible Backdoor Attack against Cross-Modal Learning".
☆14 · Aug 30, 2024 · Updated last year
sijieaaa / DistilVPR
(AAAI 2024) DistilVPR: Cross-Modal Knowledge Distillation for Visual Place Recognition
☆27 · Apr 15, 2024 · Updated 2 years ago
park-jungin / DualPath
☆49 · Nov 12, 2022 · Updated 3 years ago
kiva12138 / MITRL
Modality-Invariant Temporal Representation Learning
☆23 · Apr 21, 2023 · Updated 3 years ago
whwu95 / BIKE
【CVPR'2023】Bidirectional Cross-Modal Knowledge Exploration for Video Recognition with Pre-trained Vision-Language Models
☆156 · Sep 9, 2024 · Updated last year
chengzju / CARAT
☆25 · Apr 16, 2025 · Updated last year
naver-ai / tc-clip
[ECCV 2024] Official PyTorch implementation of TC-CLIP "Leveraging Temporal Contextualization for Video Action Recognition"
☆102 · Feb 25, 2025 · Updated last year
mininglamp-MLLM / HMLLM
[ACM MM2024] The code for HMLLM.
☆11 · Oct 27, 2024 · Updated last year
junwenzhu / chinese-image-generator
Generates training data for Chinese text recognition (OCR)
☆12 · Mar 2, 2020 · Updated 6 years ago
alibaba-mmai-research / CLIP-FSAR
Code for our IJCV 2023 paper "CLIP-guided Prototype Modulating for Few-shot Action Recognition".
☆82 · Mar 7, 2024 · Updated 2 years ago
Ferrum5 / Sorry-Android
Generates "do as you please" (为所欲为) GIFs, inspired by the sorry project
☆11 · Mar 28, 2020 · Updated 6 years ago
hananshafi / MTL-ViT
A new multi-task learning framework using Vision Transformers
☆11 · Jun 19, 2024 · Updated 2 years ago
XiaoBuL / OmniCLIP
[ECAI-2024] OmniCLIP: Adapting CLIP for Video Recognition with Spatial-Temporal Omni-Scale Feature Learning
☆16 · Jan 7, 2025 · Updated last year
mondalanindya / MSQNet
Actor-agnostic Multi-label Action Recognition with Multi-modal Query [ICCVW '23]
☆24 · Oct 20, 2023 · Updated 2 years ago
nanfangAlan / FSRFER
A TensorFlow implementation of the paper "Feature Super-Resolution Based Facial Expression Recognition for Multi-scale Low-Resolution Ima…
☆13 · Nov 30, 2021 · Updated 4 years ago
cfmata / CoPT
[ECCV 2024] Official Implementation of CoPT: Unsupervised Domain Adaptive Segmentation using Domain-Agnostic Text Embeddings
☆10 · Feb 24, 2025 · Updated last year
taoyang1122 / adapt-image-models
[ICLR'23] AIM: Adapting Image Models for Efficient Video Action Recognition
☆298 · Sep 17, 2023 · Updated 2 years ago
bruceyo / TSMF
Multimodal Fusion via Teacher-Student Network for Indoor Action Recognition
☆25 · Jul 12, 2022 · Updated 4 years ago
OpenGVLab / efficient-video-recognition
☆184 · Aug 20, 2022 · Updated 3 years ago
Gank0078 / FineSSL
PyTorch implementation for "Erasing the Bias: Fine-Tuning Foundation Models for Semi-Supervised Learning" (ICML 2024)
☆27 · May 11, 2025 · Updated last year
LinfengYuan1997 / LoSh
[CVPR 2024] LoSh: Long-Short Text Joint Prediction Network for Referring Video Object Segmentation
☆13 · Jun 17, 2024 · Updated 2 years ago
ctX-u / PLOVAD
Source code for our TCSVT 2025 paper: PLOVAD: Prompting Vision-Language Models for Open Vocabulary Video Anomaly Detection
☆33 · Feb 15, 2025 · Updated last year
EavnJeong / IEF-VAD
☆15 · May 12, 2025 · Updated last year
CGuangyan-BIT / MRA
[ICCV 2023] Rethinking Point Cloud Registration as Masking and Reconstruction
☆10 · Aug 14, 2023 · Updated 2 years ago
Youjiangbaba / Dijkstra_travel_path
☆13 · Jun 4, 2020 · Updated 6 years ago
ThomasWangY / 2024-AAAI-HPT
Learning Hierarchical Prompt with Structured Linguistic Knowledge for Vision-Language Models (AAAI 2024)
☆75 · Feb 3, 2025 · Updated last year
dingfengshi / ReAct
[ECCV 2022] Code for the paper, ReAct: Temporal Action Detection with Relational Queries
☆39 · Oct 19, 2022 · Updated 3 years ago
x4Cx58x54 / vistal
A visualization tool for temporal action localization (detection/segmentation).
☆13 · Mar 30, 2023 · Updated 3 years ago
Jianf-Wang / NP-SemiSeg
A PyTorch implementation of ICML 2023 paper "NP-SemiSeg: When Neural Processes meet Semi-Supervised Semantic Segmentation"
☆36 · Dec 2, 2023 · Updated 2 years ago