ICLR 2023 DeCap: Decoding CLIP Latents for Zero-shot Captioning
☆138Mar 16, 2023Updated 2 years ago
Alternatives and similar repositories for DeCap
Users that are interested in DeCap are comparing it to the libraries listed below
Sorting:
- CapDec: SOTA Zero Shot Image Captioning Using CLIP and GPT2, EMNLP 2022 (findings)☆203Jan 28, 2024Updated 2 years ago
- (NeurIPS 2024 Spotlight) TOPA: Extend Large Language Models for Video Understanding via Text-Only Pre-Alignment☆29Sep 27, 2024Updated last year
- SotA text-only image/video method (IJCAI 2023)☆16Jan 9, 2024Updated 2 years ago
- [ICLR2023] Video Scene Graph Generation from Single-Frame Weak Supervision☆12Sep 17, 2023Updated 2 years ago
- The official code for [ACM MM 2022] 'In-N-Out Generative Learning for Dense Unsupervised Video Segmentation'.☆20Feb 22, 2023Updated 3 years ago
- ☆59Aug 30, 2023Updated 2 years ago
- [SIGIR 2022] CenterCLIP: Token Clustering for Efficient Text-Video Retrieval.☆134May 4, 2022Updated 3 years ago
- ☆76Oct 22, 2022Updated 3 years ago
- (ACL'2023) MultiCapCLIP: Auto-Encoding Prompts for Zero-Shot Multilingual Visual Captioning☆36Aug 8, 2024Updated last year
- Implementation of Zero-Shot Image-to-Text Generation for Visual-Semantic Arithmetic☆278Sep 17, 2022Updated 3 years ago
- [CVPR 2023 & IJCV 2025] Positive-Augmented Contrastive Learning for Image and Video Captioning Evaluation☆65Jul 29, 2025Updated 7 months ago
- [TIP 2023] Co-Learning Meets Stitch-Up for Noisy Multi-label Visual Recognition.☆13Aug 19, 2023Updated 2 years ago
- The 1st place solution of 2022 Ego4d Natural Language Queries.☆32Sep 5, 2022Updated 3 years ago
- [EMNLP 2024] IFCap: Image-like Retrieval and Frequency-based Entity Filtering for Zero-shot Captioning☆15May 13, 2025Updated 9 months ago
- (CVPR2024) MeaCap: Memory-Augmented Zero-shot Image Captioning☆55Aug 16, 2024Updated last year
- Transferable Decoding with Visual Entities for Zero-Shot Image Captioning, ICCV 2023☆162Sep 9, 2024Updated last year
- Official implementation of "ConZIC: Controllable Zero-shot Image Captioning by Sampling-Based Polishing"☆75Sep 20, 2023Updated 2 years ago
- Code for Point-Calibrated Spectral Neural Operators☆20Oct 15, 2024Updated last year
- [CVPR 2022] Visual Abductive Reasoning☆124Oct 22, 2024Updated last year
- ☆17Dec 13, 2023Updated 2 years ago
- [ICCV23] Official implementation of eP-ALM: Efficient Perceptual Augmentation of Language Models.☆27Oct 27, 2023Updated 2 years ago
- Simple image captioning model☆1,408Jun 9, 2024Updated last year
- ☆11May 17, 2024Updated last year
- Self-supervised Point Cloud Representation Learning via Separating Mixed Shapes☆21May 23, 2023Updated 2 years ago
- [ICCV 2023] With a Little Help from your own Past: Prototypical Memory Networks for Image Captioning.☆19Jun 7, 2024Updated last year
- Official implementation of "Connect, Collapse, Corrupt: Learning Cross-Modal Tasks with Uni-Modal Data" (ICLR 2024)☆34Oct 16, 2024Updated last year
- Align and Prompt: Video-and-Language Pre-training with Entity Prompts☆188May 1, 2025Updated 10 months ago
- [ICCV 2023] Accurate and Fast Compressed Video Captioning☆52Jul 28, 2025Updated 7 months ago
- Implementation for "DeltaPhi: Learning Physical Trajectory Residual for PDE Solving"☆13Jun 17, 2024Updated last year
- Project for SNARE benchmark☆11Jun 5, 2024Updated last year
- Original code base for On Pretraining Data Diversity for Self-Supervised Learning☆14Dec 30, 2024Updated last year
- The benchmark for "Video Object Segmentation in Panoptic Wild Scenes".☆12Oct 17, 2023Updated 2 years ago
- [ECCV'22 Poster] Explicit Image Caption Editing☆22Nov 30, 2022Updated 3 years ago
- End-to-End Dense Video Captioning with Parallel Decoding (ICCV 2021)☆229Jan 3, 2024Updated 2 years ago
- SmallCap: Lightweight Image Captioning Prompted with Retrieval Augmentation☆126Feb 13, 2024Updated 2 years ago
- SpaceVLLM: Endowing Multimodal Large Language Model with Spatio-Temporal Video Grounding Capability☆16May 8, 2025Updated 9 months ago
- Retrieval-augmented Image Captioning☆13Feb 16, 2023Updated 3 years ago
- ☆47Sep 13, 2024Updated last year
- (ECCVW 2025)GPT4RoI: Instruction Tuning Large Language Model on Region-of-Interest☆551Jun 3, 2025Updated 9 months ago