hengRUC / VSP
☆22 Updated last year
Related projects
Alternatives and complementary repositories for VSP
- Official PyTorch repository for "Knowing Where to Focus: Event-aware Transformer for Video Grounding" (ICCV 2023) ☆48 Updated last year
- [ICCV 2023 CLVL Workshop] Zero-Shot and Few-Shot Video Question Answering with Multi-Modal Prompts ☆11 Updated last year
- Temporal Alignment Representations with Contrastive Learning ☆22 Updated last year
- Official Implementation of SnAG (CVPR 2024) ☆37 Updated 3 weeks ago
- [CVPR 2024] Context-Guided Spatio-Temporal Video Grounding ☆42 Updated 4 months ago
- (CVPR 2023) Official implementation of the paper "Weakly Supervised Video Representation Learning with Unaligned Text for Sequential Videos… ☆28 Updated 7 months ago
- [CVPR 2024] - Official code for the paper "Temporally Consistent Unbalanced Optimal Transport for Unsupervised Action Segmentation" ☆24 Updated 3 months ago
- Official implementation of "Test-Time Zero-Shot Temporal Action Localization", CVPR 2024 ☆43 Updated 2 months ago
- [CVPR 2024] Do you remember? Dense Video Captioning with Cross-Modal Memory Retrieval ☆45 Updated 5 months ago
- ICCV2023: Disentangling Spatial and Temporal Learning for Efficient Image-to-Video Transfer Learning ☆39 Updated last year
- Code for Static and Dynamic Concepts for Self-supervised Video Representation Learning. ☆10 Updated 2 years ago
- Official repository for "Vita-CLIP: Video and text adaptive CLIP via Multimodal Prompting" [CVPR 2023] ☆110 Updated last year
- [AAAI 2024] DGL: Dynamic Global-Local Prompt Tuning for Text-Video Retrieval. ☆31 Updated last month
- Code implementation of paper "MUSE: Mamba is Efficient Multi-scale Learner for Text-video Retrieval" ☆12 Updated 2 months ago
- Code for Diffusion Action Segmentation (ICCV 2023) ☆52 Updated last year
- Official PyTorch code of "Grounded Question-Answering in Long Egocentric Videos", accepted by CVPR 2024. ☆52 Updated 2 months ago
- Source code of our CVPR2024 paper TeachCLIP for Text-to-Video Retrieval ☆22 Updated 3 weeks ago
- Can I Trust Your Answer? Visually Grounded Video Question Answering (CVPR'24, Highlight) ☆59 Updated 4 months ago
- Official implementation of "ZeroI2V: Zero-Cost Adaptation of Pre-trained Transformers from Image to Video" (ECCV2024) ☆18 Updated 3 months ago
- ☆30 Updated 2 years ago
- ☆21 Updated last month
- (CVPR2024) Realigning Confidence with Temporal Saliency Information for Point-level Weakly-Supervised Temporal Action Localization ☆18 Updated 5 months ago
- (CVPR2024) MeaCap: Memory-Augmented Zero-shot Image Captioning ☆37 Updated 3 months ago
- [CVPR 2023 Highlight] Video-Text as Game Players: Hierarchical Banzhaf Interaction for Cross-Modal Representation Learning ☆109 Updated 7 months ago
- HT-Step is a large-scale article grounding dataset of temporal step annotations on how-to videos ☆16 Updated 8 months ago
- Official Repo for CVPR 2024 Paper "FACT: Frame-Action Cross-Attention Temporal Modeling for Efficient Fully-Supervised Action Segmentatio… ☆35 Updated 5 months ago
- Code for our CVPR 2023 paper "MoLo: Motion-augmented Long-short Contrastive Learning for Few-shot Action Recognition". ☆39 Updated 8 months ago
- ☆47 Updated 2 years ago
- [ECCV 2024] EgoCVR: An Egocentric Benchmark for Fine-Grained Composed Video Retrieval ☆28 Updated 2 months ago