Yuting-Gao / PyramidCLIP
Implementation of PyramidCLIP (NeurIPS 2022).
☆30Updated 2 years ago
Alternatives and similar repositories for PyramidCLIP:
Users who are interested in PyramidCLIP are comparing it to the libraries listed below.
- Official PyTorch implementation of Clover: Towards A Unified Video-Language Alignment and Fusion Model (CVPR2023)☆40Updated 2 years ago
- [ICCV 2023] ALIP: Adaptive Language-Image Pre-training with Synthetic Caption☆97Updated last year
- [CVPR2024] The code of "UniPT: Universal Parallel Tuning for Transfer Learning with Efficient Parameter and Memory"☆66Updated 4 months ago
- Official Codes for Fine-Grained Visual Prompting, NeurIPS 2023☆48Updated last year
- Turning to Video for Transcript Sorting☆48Updated last year
- Pink: Unveiling the Power of Referential Comprehension for Multi-modal LLMs☆88Updated last month
- SeqTR: A Simple yet Universal Network for Visual Grounding☆130Updated 3 months ago
- The official implementation of RAR☆81Updated 10 months ago
- Official PyTorch implementation of the paper "Revisiting Temporal Modeling for CLIP-based Image-to-Video Knowledge Transferring"☆99Updated last year
- ☆89Updated last year
- ☆22Updated last year
- 📍 Official PyTorch implementation of paper "ProtoCLIP: Prototypical Contrastive Language Image Pretraining" (IEEE TNNLS)☆52Updated last year
- [ICCV 2023] Generative Prompt Model for Weakly Supervised Object Localization☆55Updated last year
- Seeing What You Miss: Vision-Language Pre-training with Semantic Completion Learning☆20Updated last year
- ☆58Updated last year
- ☆28Updated last year
- FreeVA: Offline MLLM as Training-Free Video Assistant☆55Updated 8 months ago
- ☆65Updated 2 months ago
- ☆33Updated 7 months ago
- [CVPR 2023] VoP: Text-Video Co-operative Prompt Tuning for Cross-Modal Retrieval☆38Updated last year
- ☆110Updated last year
- ☆89Updated last year
- [ICCV2023] Official code for "VL-PET: Vision-and-Language Parameter-Efficient Tuning via Granularity Control"☆53Updated last year
- ICLR'24 Official Implementation of Composed Image Retrieval with Text Feedback via Multi-grained Uncertainty Regularization☆71Updated last year
- ☆47Updated 2 years ago
- [AAAI 2023] DQ-DETR: Dual Query Detection Transformer for Phrase Extraction and Grounding☆56Updated 2 years ago
- ☆88Updated last year
- Obj2Seq: Formatting Objects as Sequences with Class Prompt for Visual Tasks (NeurIPS2022)☆84Updated 2 years ago
- Official implementation of TagAlign☆34Updated 2 months ago
- ☆56Updated 2 years ago