dhg-wei/DeCap

ICLR 2023 DeCap: Decoding CLIP Latents for Zero-shot Captioning

☆144

Alternatives and similar repositories for DeCap

Users who are interested in DeCap are comparing it to the repositories listed below.

junyangwang0410 / Knight
SotA text-only image/video method (IJCAI 2023)
☆15 · Jan 9, 2024 · Updated 2 years ago
DavidHuji / CapDec
CapDec: SOTA Zero Shot Image Captioning Using CLIP and GPT2, EMNLP 2022 (findings)
☆209 · Jan 28, 2024 · Updated 2 years ago
dhg-wei / TOPA
(NeurIPS 2024 Spotlight) TOPA: Extend Large Language Models for Video Understanding via Text-Only Pre-Alignment
☆29 · Sep 27, 2024 · Updated last year
YoadTew / zero-shot-video-to-text
☆75 · Oct 22, 2022 · Updated 3 years ago
allenai / close
☆59 · Aug 30, 2023 · Updated 2 years ago
yangbang18 / MultiCapCLIP
(ACL'2023) MultiCapCLIP: Auto-Encoding Prompts for Zero-Shot Multilingual Visual Captioning
☆36 · Aug 8, 2024 · Updated last year
mzhaoshuai / CenterCLIP
[SIGIR 2022] CenterCLIP: Token Clustering for Efficient Text-Video Retrieval.
☆134 · May 4, 2022 · Updated 4 years ago
zjucsq / PLA
[ICLR2023] Video Scene Graph Generation from Single-Frame Weak Supervision
☆12 · Sep 17, 2023 · Updated 2 years ago
boreng0817 / IFCap
[EMNLP 2024] IFCap: Image-like Retrieval and Frequency-based Entity Filtering for Zero-shot Captioning
☆15 · May 13, 2025 · Updated last year
VamosC / CoLearning-meet-StitchUp
[TIP 2023] Co-Learning Meets Stitch-Up for Noisy Multi-label Visual Recognition.
☆13 · Aug 19, 2023 · Updated 2 years ago
YoadTew / zero-shot-image-to-text
Implementation of Zero-Shot Image-to-Text Generation for Visual-Semantic Arithmetic
☆279 · Sep 17, 2022 · Updated 3 years ago
FeiElysia / ViECap
Transferable Decoding with Visual Entities for Zero-Shot Image Captioning, ICCV 2023
☆167 · Sep 9, 2024 · Updated last year
joeyz0z / ConZIC
Official implementation of "ConZIC: Controllable Zero-shot Image Captioning by Sampling-Based Polishing"
☆76 · Sep 20, 2023 · Updated 2 years ago
leonnnop / VAR
[CVPR 2022] Visual Abductive Reasoning
☆124 · Oct 22, 2024 · Updated last year
aimagelab / pacscore
[CVPR 2023 & IJCV 2025] Positive-Augmented Contrastive Learning for Image and Video Captioning Evaluation
☆66 · Jul 29, 2025 · Updated 11 months ago
rmokady / CLIP_prefix_caption
Simple image captioning model
☆1,421 · Jun 9, 2024 · Updated 2 years ago
NNNNAI / Ego4d_NLQ_2022_1st_Place_Solution
The 1st place solution of 2022 Ego4d Natural Language Queries.
☆32 · Sep 5, 2022 · Updated 3 years ago
WangWenhao0716 / ASL
[AAAI 2023] The official implementation of "A Benchmark and Asymmetrical-Similarity Learning for Practical Image Copy Detection"
☆22 · Jan 24, 2025 · Updated last year
yuhui-zh15 / C3
Official implementation of "Connect, Collapse, Corrupt: Learning Cross-Modal Tasks with Uni-Modal Data" (ICLR 2024)
☆36 · Oct 16, 2024 · Updated last year
liupeng0606 / clip4caption
The first unofficial implementation of CLIP4Caption: CLIP for Video Caption (ACMMM 2021)
☆16 · Jan 2, 2023 · Updated 3 years ago
NovaMind-Z / PTSN
Repository for an end-to-end image captioning method PTSN (ACM MM22).
☆60 · Dec 11, 2022 · Updated 3 years ago
RitaRamo / extra
Retrieval-augmented Image Captioning
☆13 · Feb 16, 2023 · Updated 3 years ago
fkodom / clip-text-decoder
Generate text captions for images from their embeddings.
☆119 · Aug 1, 2023 · Updated 2 years ago
Yaojie-Shen / CoCap
[ICCV 2023] Accurate and Fast Compressed Video Captioning
☆52 · Jul 28, 2025 · Updated 11 months ago
ttengwang / PDVC
End-to-End Dense Video Captioning with Parallel Decoding (ICCV 2021)
☆230 · Jan 3, 2024 · Updated 2 years ago
joeyz0z / MeaCap
(CVPR2024) MeaCap: Memory-Augmented Zero-shot Image Captioning
☆56 · Aug 16, 2024 · Updated last year
cyysc1998 / 3D-Pretraining
Self-supervised Point Cloud Representation Learning via Separating Mixed Shapes
☆21 · May 23, 2023 · Updated 3 years ago
LHL3341 / ContextBLIP
ContextBLIP: Doubly Contextual Alignment for Contrastive Image Retrieval from Linguistically Complex Descriptions [ACL 2024]
☆11 · May 17, 2024 · Updated 2 years ago
RitaRamo / smallcap
SmallCap: Lightweight Image Captioning Prompted with Retrieval Augmentation
☆125 · Feb 13, 2024 · Updated 2 years ago
WangFei-2019 / SNARE
Project for SNARE benchmark
☆11 · Jun 5, 2024 · Updated 2 years ago
minghangz / SPL
Generating Structured Pseudo Labels for Noise-resistant Zero-shot Video Sentence Localization
☆16 · Jul 20, 2023 · Updated 3 years ago
ml-jku / semantic-image-text-alignment
☆25 · Jul 10, 2023 · Updated 3 years ago
mshukor / eP-ALM
[ICCV23] Official implementation of eP-ALM: Efficient Perceptual Augmentation of Language Models.
☆27 · Oct 27, 2023 · Updated 2 years ago
baaaad / ECE
[ECCV'22 Poster] Explicit Image Caption Editing
☆22 · Nov 30, 2022 · Updated 3 years ago
sxl142 / GLoT
Global-to-Local Modeling for Video-based 3D Human Pose and Shape Estimation
☆59 · Jun 21, 2023 · Updated 3 years ago
Dreamer312 / SEED-GRPO
The official repository of SEED-GRPO: Semantic Entropy Enhanced GRPO for Uncertainty-Aware Policy Optimization
☆159 · Jan 29, 2026 · Updated 5 months ago
layumi / HQ-Market
Market-1501 dataset with super-resolution quality
☆21 · May 12, 2022 · Updated 4 years ago
salesforce / ALPRO
Align and Prompt: Video-and-Language Pre-training with Entity Prompts
☆188 · May 1, 2025 · Updated last year
232525 / PureT
Implementation of 'End-to-End Transformer Based Model for Image Captioning' [AAAI 2022]
☆70 · Jun 1, 2024 · Updated 2 years ago