CapDec: SOTA Zero Shot Image Captioning Using CLIP and GPT2, EMNLP 2022 (findings)
☆205Jan 28, 2024Updated 2 years ago
Alternatives and similar repositories for CapDec
Users that are interested in CapDec are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- ICLR 2023 DeCap: Decoding CLIP Latents for Zero-shot Captioning☆142Mar 16, 2023Updated 3 years ago
- ☆59Aug 30, 2023Updated 2 years ago
- Transferable Decoding with Visual Entities for Zero-Shot Image Captioning, ICCV 2023☆164Sep 9, 2024Updated last year
- Simple image captioning model☆1,417Jun 9, 2024Updated last year
- Language Models Can See: Plugging Visual Controls in Text Generation☆259Jun 1, 2022Updated 3 years ago
- Deploy open-source AI quickly and easily - Special Bonus Offer • AdRunpod Hub is built for open source. One-click deployment and autoscaling endpoints without provisioning your own infrastructure.
- Implementation of Zero-Shot Image-to-Text Generation for Visual-Semantic Arithmetic☆277Sep 17, 2022Updated 3 years ago
- CaMEL: Mean Teacher Learning for Image Captioning. ICPR 2022☆30Dec 1, 2022Updated 3 years ago
- PyTorch code for "Fine-grained Image Captioning with CLIP Reward" (Findings of NAACL 2022)☆246Jun 10, 2025Updated 10 months ago
- PyTorch code for "VL-Adapter: Parameter-Efficient Transfer Learning for Vision-and-Language Tasks" (CVPR2022)☆211Dec 18, 2022Updated 3 years ago
- SotA text-only image/video method (IJCAI 2023)☆15Jan 9, 2024Updated 2 years ago
- 基于ClipCap的看图说话Image Caption模型☆324Apr 1, 2022Updated 4 years ago
- A simple and effective feature extractor for untrimmed videos☆13Sep 1, 2022Updated 3 years ago
- METER: A Multimodal End-to-end TransformER Framework☆376Nov 16, 2022Updated 3 years ago
- [ICLR2023] PLOT: Prompt Learning with Optimal Transport for Vision-Language Models☆175Dec 14, 2023Updated 2 years ago
- Deploy on Railway without the complexity - Free Credits Offer • AdConnect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
- Pytorch code for Language Models with Image Descriptors are Strong Few-Shot Video-Language Learners☆117Sep 15, 2022Updated 3 years ago
- implementation of paper https://arxiv.org/abs/2210.04559☆57Nov 26, 2025Updated 5 months ago
- ☆15Nov 19, 2020Updated 5 years ago
- ☆201May 10, 2023Updated 2 years ago
- [ECCV 2024] Official PyTorch implementation of "HYPE: Hyperbolic Entailment Filtering for Underspecified Images and Texts"☆20Nov 22, 2024Updated last year
- ☆21Mar 19, 2024Updated 2 years ago
- Mind the Gap: Understanding the Modality Gap in Multi-modal Contrastive Representation Learning☆175Sep 26, 2022Updated 3 years ago
- ☆195Updated this week
- Evaluating Vision & Language Pretraining Models with Objects, Attributes and Relations. [EMNLP 2022]☆139Apr 10, 2026Updated 2 weeks ago
- Managed Database hosting by DigitalOcean • AdPostgreSQL, MySQL, MongoDB, Kafka, Valkey, and OpenSearch available. Automatically scale up storage and focus on building your apps.
- Research code for CVPR 2022 paper: "EMScore: Evaluating Video Captioning via Coarse-Grained and Fine-Grained Embedding Matching"☆26Oct 20, 2022Updated 3 years ago
- GRIT: Faster and Better Image-captioning Transformer (ECCV 2022)☆199May 9, 2023Updated 2 years ago
- Align and Prompt: Video-and-Language Pre-training with Entity Prompts☆188May 1, 2025Updated 11 months ago
- [ICLR 2022] code for "How Much Can CLIP Benefit Vision-and-Language Tasks?" https://arxiv.org/abs/2107.06383☆419Oct 28, 2022Updated 3 years ago
- Momentum Decoding: Open-ended Text Generation as Graph Exploration☆19Jan 27, 2023Updated 3 years ago
- Repository for the paper "Data Efficient Masked Language Modeling for Vision and Language".☆18Sep 17, 2021Updated 4 years ago
- A Good Prompt Is Worth Millions of Parameters: Low-resource Prompt-based Learning for Vision-Language Models (ACL 2022)☆42May 13, 2022Updated 3 years ago
- Code for the paper: "SuS-X: Training-Free Name-Only Transfer of Vision-Language Models" [ICCV'23]☆105Aug 22, 2023Updated 2 years ago
- Cross Modal Retrieval with Querybank Normalisation☆57Nov 21, 2023Updated 2 years ago
- Deploy on Railway without the complexity - Free Credits Offer • AdConnect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
- [CVPR 2022] Official code for "RegionCLIP: Region-based Language-Image Pretraining"☆814Mar 20, 2024Updated 2 years ago
- ☆36Feb 5, 2024Updated 2 years ago
- Experiments and data for the paper "When and why vision-language models behave like bags-of-words, and what to do about it?" Oral @ ICLR …☆294Jun 7, 2023Updated 2 years ago
- aft - advanced file transfer.☆45Apr 15, 2026Updated 2 weeks ago
- [ACL 2023] Official PyTorch code for Singularity model in "Revealing Single Frame Bias for Video-and-Language Learning"☆136May 5, 2023Updated 2 years ago
- Code for the paper "Understanding and Evaluating Racial Biases in Image Captioning"☆12Mar 26, 2026Updated last month
- Official Repository of ChatCaptioner☆468Apr 13, 2023Updated 3 years ago