naver-ai / tc-clip
[ECCV 2024] Official PyTorch implementation of TC-CLIP "Leveraging Temporal Contextualization for Video Action Recognition"
☆15 · Updated last month
Related projects:
- ICCV2023: Disentangling Spatial and Temporal Learning for Efficient Image-to-Video Transfer Learning ☆35 · Updated 11 months ago
- ☆30 · Updated 9 months ago
- ☆34 · Updated 5 months ago
- Official Pytorch Implementation of 'BAM-DETR: Boundary-Aligned Moment Detection Transformer for Temporal Sentence Grounding in Videos' ☆20 · Updated 9 months ago
- Seeing What You Miss: Vision-Language Pre-training with Semantic Completion Learning ☆20 · Updated 8 months ago
- ☆25 · Updated last year
- Distribution-Aware Prompt Tuning for Vision-Language Models (ICCV 2023) ☆36 · Updated 9 months ago
- [CVPR 2024] Do you remember? Dense Video Captioning with Cross-Modal Memory Retrieval ☆38 · Updated 3 months ago
- Composed Video Retrieval ☆42 · Updated 4 months ago
- Official implementation of CVPR 2024 paper "vid-TLDR: Training Free Token merging for Light-weight Video Transformer". ☆32 · Updated 4 months ago
- Official Pytorch implementation of "Test-Time Zero-Shot Temporal Action Localization", CVPR 2024 ☆38 · Updated last week
- Official pytorch repository for "Knowing Where to Focus: Event-aware Transformer for Video Grounding" (ICCV 2023) ☆46 · Updated last year
- Repository of "Improving Cross-Modal Retrieval With Set of Diverse Embeddings" (CVPR'23, Highlight) ☆36 · Updated 10 months ago
- [CVPR 2023] Learning Attention as Disentangler for Compositional Zero-shot Learning ☆36 · Updated last year
- ☆32 · Updated 5 months ago
- ☆21 · Updated last year
- (CVPR2024) MeaCap: Memory-Augmented Zero-shot Image Captioning ☆31 · Updated last month
- Official repository for "Vita-CLIP: Video and text adaptive CLIP via Multimodal Prompting" [CVPR 2023] ☆104 · Updated last year
- ☆11 · Updated 3 weeks ago
- [CVPR 2024] Context-Guided Spatio-Temporal Video Grounding ☆38 · Updated 2 months ago
- ☆27 · Updated this week
- ☆10 · Updated last year
- [ECCV 2024] EgoCVR: An Egocentric Benchmark for Fine-Grained Composed Video Retrieval ☆16 · Updated 3 weeks ago
- Visual Delta Generator with Large Multi-modal Model for Semi-supervised Composed Image Retrieval - CVPR2024 ☆12 · Updated 3 months ago
- Context-I2W: Mapping Images to Context-dependent words for Accurate Zero-Shot Composed Image Retrieval [AAAI 2024 Oral] ☆36 · Updated 5 months ago
- Repository for the CVPR23 paper Re^2TAL ☆12 · Updated 5 months ago
- Repo of NeurIPS23 ☆12 · Updated 10 months ago
- ☆60 · Updated last year
- 📍 Official pytorch implementation of paper "ProtoCLIP: Prototypical Contrastive Language Image Pretraining" (IEEE TNNLS) ☆45 · Updated 10 months ago
- ☆19 · Updated last year