[AAAI 2024 Oral] M2CLIP: A Multimodal, Multi-Task Adapting Framework for Video Action Recognition
☆70Dec 23, 2024Updated last year
Alternatives and similar repositories for m2clip
Users who are interested in m2clip are comparing it to the libraries listed below.
- [CVPR 2024] Adapting Short-Term Transformers for Action Detection in Untrimmed Videos☆11Jun 11, 2024Updated 2 years ago
- Placeholder☆10Jul 17, 2023Updated 2 years ago
- ☆42Apr 7, 2024Updated 2 years ago
- ☆119Feb 19, 2024Updated 2 years ago
- Taylor videos and Taylor-transformed skeletons (ICML 2024).☆17Jul 25, 2024Updated last year
- [ECCV 2024] Learning Video Context as Interleaved Multimodal Sequences☆45Mar 11, 2025Updated last year
- ICCV2023: Disentangling Spatial and Temporal Learning for Efficient Image-to-Video Transfer Learning☆41Sep 25, 2023Updated 2 years ago
- [ECCV 2024] Official PyTorch implementation of TC-CLIP "Leveraging Temporal Contextualization for Video Action Recognition"☆102Feb 25, 2025Updated last year
- A simple but efficient transformer model for video action recognition☆64Oct 8, 2022Updated 3 years ago
- Official repository for "Vita-CLIP: Video and text adaptive CLIP via Multimodal Prompting" [CVPR 2023]☆126Jul 1, 2023Updated 3 years ago
- [ACMMM 2024] Implementation of the paper "Multi-Modality Co-Learning for Efficient Skeleton-based Action Recognition".☆48Mar 21, 2025Updated last year
- Convolutional Initialization for Data-Efficient Vision Transformers☆15Dec 9, 2025Updated 6 months ago
- Pytorch implementation for "Erasing the Bias: Fine-Tuning Foundation Models for Semi-Supervised Learning" (ICML 2024)☆27May 11, 2025Updated last year
- [IEEE TIP] Official implementation for the work "BadCM: Invisible Backdoor Attack against Cross-Modal Learning".☆14Aug 30, 2024Updated last year
- ☆49Nov 12, 2022Updated 3 years ago
- 【CVPR'2023】Bidirectional Cross-Modal Knowledge Exploration for Video Recognition with Pre-trained Vision-Language Models☆156Sep 9, 2024Updated last year
- Modality-Invariant Temporal Representation Learning☆23Apr 21, 2023Updated 3 years ago
- Multimodal Fusion via Teacher-Student Network for Indoor Action Recognition☆25Jul 12, 2022Updated 3 years ago
- a TensorFlow implementation of the paper "Feature Super-Resolution Based Facial Expression Recognition for Multi-scale Low-Resolution Ima…☆13Nov 30, 2021Updated 4 years ago
- Generate training data for Chinese text recognition (OCR)☆12Mar 2, 2020Updated 6 years ago
- [ACM MM2024] The code for HMLLM.☆11Oct 27, 2024Updated last year
- ☆12Dec 14, 2023Updated 2 years ago
- Entropy-Driven GRPO with Guided Error Correction for Advantage Diversity☆22Aug 28, 2025Updated 10 months ago
- Code for our IJCV 2023 paper "CLIP-guided Prototype Modulating for Few-shot Action Recognition".☆82Mar 7, 2024Updated 2 years ago
- Actor-agnostic Multi-label Action Recognition with Multi-modal Query [ICCVW '23]☆24Oct 20, 2023Updated 2 years ago
- A series of face anti-spoofing datasets, for the convenience of management and benchmarking.☆17May 12, 2026Updated last month
- [ICLR'23] AIM: Adapting Image Models for Efficient Video Action Recognition☆301Sep 17, 2023Updated 2 years ago
- Generate "do whatever you want" (为所欲为) meme GIFs, inspired by the sorry project☆11Mar 28, 2020Updated 6 years ago
- [ECCV 2024] Official Implementation of CoPT: Unsupervised Domain Adaptive Segmentation using Domain-Agnostic Text Embeddings☆11Feb 24, 2025Updated last year
- ☆184Aug 20, 2022Updated 3 years ago
- [ICLR 2024] FROSTER: Frozen CLIP is a Strong Teacher for Open-Vocabulary Action Recognition☆102Jan 14, 2025Updated last year
- GPT4Vis: What Can GPT-4 Do for Zero-shot Visual Recognition?☆184May 22, 2024Updated 2 years ago
- ☆10Dec 17, 2024Updated last year
- ☆15May 12, 2025Updated last year
- Official Repo for CVPR 2024 Paper "FACT: Frame-Action Cross-Attention Temporal Modeling for Efficient Fully-Supervised Action Segmentatio…☆103Jan 23, 2026Updated 5 months ago
- [ICCV 2025] Official implementation of 'SEAL: Semantic Aware Image Watermarking'☆23May 11, 2026Updated last month
- The official implementation of the paper "Asymmetric Polynomial Loss for Multi-Label Classification" (ICASSP 2023)☆20Apr 5, 2023Updated 3 years ago
- Source codes of our paper in TCSVT 2025: PLOVAD: Prompting Vision-Language Models for Open Vocabulary Video Anomaly Detection☆31Feb 15, 2025Updated last year
- [ECCV 2022] Code for the paper, ReAct: Temporal Action Detection with Relational Queries☆39Oct 19, 2022Updated 3 years ago