[AAAI 2024 Oral] M2CLIP: A Multimodal, Multi-Task Adapting Framework for Video Action Recognition
☆72 · Dec 23, 2024 · Updated last year
Alternatives and similar repositories for m2clip
Users that are interested in m2clip are comparing it to the libraries listed below.
- This is the official implementation of the paper "ActionCLIP: A New Paradigm for Action Recognition" ☆604 · Dec 6, 2023 · Updated 2 years ago
- [CVPR 2024] Adapting Short-Term Transformers for Action Detection in Untrimmed Videos ☆12 · Jun 11, 2024 · Updated last year
- ☆42 · Apr 7, 2024 · Updated last year
- ☆119 · Feb 19, 2024 · Updated 2 years ago
- Taylor videos and Taylor-transformed skeletons (ICML 2024). ☆16 · Jul 25, 2024 · Updated last year
- [ECCV 2024] Learning Video Context as Interleaved Multimodal Sequences ☆43 · Mar 11, 2025 · Updated last year
- ICCV2023: Disentangling Spatial and Temporal Learning for Efficient Image-to-Video Transfer Learning ☆41 · Sep 25, 2023 · Updated 2 years ago
- A simple but efficient transformer model for video action recognition ☆62 · Oct 8, 2022 · Updated 3 years ago
- Official repository for "Vita-CLIP: Video and text adaptive CLIP via Multimodal Prompting" [CVPR 2023] ☆126 · Jul 1, 2023 · Updated 2 years ago
- ☆17 · Oct 10, 2023 · Updated 2 years ago
- Convolutional Initialization for Data-Efficient Vision Transformers ☆15 · Dec 9, 2025 · Updated 3 months ago
- (AAAI 2024) DistilVPR: Cross-Modal Knowledge Distillation for Visual Place Recognition ☆26 · Apr 15, 2024 · Updated last year
- ☆49 · Nov 12, 2022 · Updated 3 years ago
- 【CVPR'2023】Bidirectional Cross-Modal Knowledge Exploration for Video Recognition with Pre-trained Vision-Language Models ☆154 · Sep 9, 2024 · Updated last year
- Modality-Invariant Temporal Representation Learning ☆22 · Apr 21, 2023 · Updated 2 years ago
- Multimodal Fusion via Teacher-Student Network for Indoor Action Recognition ☆25 · Jul 12, 2022 · Updated 3 years ago
- a TensorFlow implementation of the paper "Feature Super-Resolution Based Facial Expression Recognition for Multi-scale Low-Resolution Ima… ☆13 · Nov 30, 2021 · Updated 4 years ago
- A new multi-task learning framework using Vision Transformers ☆11 · Jun 19, 2024 · Updated last year
- [ACM MM2024] The code for HMLLM. ☆11 · Oct 27, 2024 · Updated last year
- Code for our IJCV 2023 paper "CLIP-guided Prototype Modulating for Few-shot Action Recognition". ☆78 · Mar 7, 2024 · Updated 2 years ago
- Actor-agnostic Multi-label Action Recognition with Multi-modal Query [ICCVW '23] ☆24 · Oct 20, 2023 · Updated 2 years ago
- [ICLR'23] AIM: Adapting Image Models for Efficient Video Action Recognition ☆301 · Sep 17, 2023 · Updated 2 years ago
- [ECCV 2024] Official Implementation of CoPT: Unsupervised Domain Adaptive Segmentation using Domain-Agnostic Text Embeddings ☆11 · Feb 24, 2025 · Updated last year
- ☆182 · Aug 20, 2022 · Updated 3 years ago
- [ICLR 2024] FROSTER: Frozen CLIP is a Strong Teacher for Open-Vocabulary Action Recognition ☆97 · Jan 14, 2025 · Updated last year
- Official Repo for CVPR 2024 Paper "FACT: Frame-Action Cross-Attention Temporal Modeling for Efficient Fully-Supervised Action Segmentatio… ☆90 · Jan 23, 2026 · Updated 2 months ago
- [CVPR 2024] LoSh: Long-Short Text Joint Prediction Network for Referring Video Object Segmentation ☆13 · Jun 17, 2024 · Updated last year
- OBD-II Data Based Driver Identification System Based on Deep-LSTM ☆12 · Jul 13, 2020 · Updated 5 years ago
- This repository contains our codebase for the method CABINET that tackles the task of Table Question Answering and achieves state-of-the-… ☆13 · Jul 16, 2024 · Updated last year
- This is the repository to the article "NEWBEE: A Multi-Modal Gait Database of Natural Everyday-Walk in an Urban Environment", 2022 ☆11 · Aug 2, 2022 · Updated 3 years ago
- Code and Dataset for the paper "LAMM: Label Alignment for Multi-Modal Prompt Learning" AAAI 2024 ☆34 · Jan 3, 2024 · Updated 2 years ago
- A Pytorch implementation of ICML 2023 paper "NP-SemiSeg: When Neural Processes meet Semi-Supervised Semantic Segmentation" ☆36 · Dec 2, 2023 · Updated 2 years ago
- ☆85 · May 8, 2023 · Updated 2 years ago
- Code for Adaptation Network introduced in "Block-wise Scrambled Image Recognition Using Adaptation Network" paper (AAAI WS 2020) ☆12 · Dec 3, 2019 · Updated 6 years ago
- Paper has been accepted in ACM MM 2024. ☆13 · Jul 4, 2025 · Updated 8 months ago
- A simple and effective feature extractor for untrimmed videos ☆13 · Sep 1, 2022 · Updated 3 years ago
- ☆18 · Apr 13, 2022 · Updated 3 years ago
- End-to-end implementation of the Social Graph Network (SGN), described in the Structural Reasoning for Image-based Social Relation Recogn… ☆13 · Apr 3, 2024 · Updated last year
- Global-local Multimodal Fusion Driving Behavior Classification Network ☆12 · Feb 2, 2023 · Updated 3 years ago