eanson023 / rehamot
Official PyTorch implementation of the paper "Cross-Modal Retrieval for Motion and Text via DropTriple Loss"
☆36Updated 4 months ago
Related projects:
- IGANet, single-frame based 3D human pose estimation☆51Updated last year
- [TCSVT 2024] Official PyTorch implementation of the paper "MLP: Motion Label Prior for Temporal Sentence Localization in Untrimmed 3D Hum…☆22Updated last month
- [ICCV 2023] PyTorch Implementation of "Co-Evolution of Pose and Mesh for 3D Human Body Estimation from Video"☆149Updated 4 months ago
- HTFormer: Human Topology Aware Transformer for 3D Human Pose Estimation☆173Updated 8 months ago
- [MM'24 Oral] Prior Knowledge Integration via LLM Encoding and Pseudo Event Regulation for Video Moment Retrieval☆135Updated 3 weeks ago
- [CVPR24] Official Implementation of 'A Video is Worth 256 Bases: Spatial-Temporal Expectation-Maximization Inversion for Zero-Shot Video …☆112Updated 3 months ago
- FineDance: A Fine-grained Choreography Dataset for 3D Full Body Dance Generation. (ICCV2023)☆113Updated 3 months ago
- 【ICCV'2023】What Can Simple Arithmetic Operations Do for Temporal Modeling?☆73Updated 7 months ago
- Colar: Effective and Efficient Online Action Detection by Consulting Exemplars, CVPR 2022.☆263Updated 2 years ago
- [ICCV23] Bird’s-Eye-View Scene Graph for Vision-Language Navigation☆114Updated 5 months ago
- Simple code demos about classic AIGC models/Compilation of blogs and papers on classic AIGC models.☆54Updated 2 weeks ago
- The code for the CVPR2024 paper Lodge: A Coarse to Fine Diffusion Network for Long Dance Generation Guided by the Characteristic Dance Primit…☆101Updated 2 months ago
- Video-Inpaint-Anything: This is the inference code for our paper CoCoCo: Improving Text-Guided Video Inpainting for Better Consistency, C…☆144Updated last week
- 【AAAI'2023 & IJCV】Transferring Vision-Language Models for Visual Recognition: A Classifier Perspective☆202Updated 3 months ago
- A work list of recent human video generation methods. This repository focuses on half/full-body human video generation methods, The Nerf, Gau…☆174Updated last month
- Skeleton-based action recognition models in PyTorch, including Two-Stream CNN, HCN, HCN-Baseline, Ta-CNN and Dynamic GCN☆158Updated 2 years ago
- Official code for "A Closer Look at Audio-Visual Segmentation"☆107Updated last month
- HiWilliamWWL / Learn-to-Predict-How-Humans-Manipulate-Large-Sized-Objects-From-Interactive-Motions-objects: This is the repo for the paper "Learn to Predict How Humans Manipulate Large-Sized Objects From Interactive Motions"☆25Updated 9 months ago
- Code for WS3DPT☆76Updated 3 months ago
- 【CVPR'2023 Highlight & TPAMI】Cap4Video: What Can Auxiliary Captions Do for Text-Video Retrieval?☆226Updated last week
- The official implementation of BackTAL, TPAMI 2021.☆218Updated 2 years ago
- 【CVPR'2023】Bidirectional Cross-Modal Knowledge Exploration for Video Recognition with Pre-trained Vision-Language Models☆156Updated last week
- GPT4Vis: What Can GPT-4 Do for Zero-shot Visual Recognition?☆202Updated 3 months ago
- [TRIT 2024] Implementation of the paper “Explore Human Parsing Modality for Action Recognition”.☆33Updated 3 weeks ago
- Multi-granularity Correspondence Learning from Long-term Noisy Videos [ICLR 2024, Oral]☆106Updated 5 months ago
- [TMM 2024] Implementation of the paper “Temporal Decoupling Graph Convolutional Network for Skeleton-based Gesture Recognition”.☆50Updated last month
- This is the official reproduction of Qihoo-T2X.☆75Updated last week
- [ECCV 2024] InterFusion: Text-Driven Generation of 3D Human-Object Interaction☆49Updated 2 months ago