CapDec: SOTA Zero Shot Image Captioning Using CLIP and GPT2, EMNLP 2022 (findings)
☆204Jan 28, 2024Updated 2 years ago
Alternatives and similar repositories for CapDec
Users that are interested in CapDec are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- ICLR 2023 DeCap: Decoding CLIP Latents for Zero-shot Captioning☆140Mar 16, 2023Updated 3 years ago
- ☆59Aug 30, 2023Updated 2 years ago
- Transferable Decoding with Visual Entities for Zero-Shot Image Captioning, ICCV 2023☆164Sep 9, 2024Updated last year
- Simple image captioning model☆1,416Jun 9, 2024Updated last year
- Language Models Can See: Plugging Visual Controls in Text Generation☆258Jun 1, 2022Updated 3 years ago
- Managed Database hosting by DigitalOcean • AdPostgreSQL, MySQL, MongoDB, Kafka, Valkey, and OpenSearch available. Automatically scale up storage and focus on building your apps.
- Implementation of Zero-Shot Image-to-Text Generation for Visual-Semantic Arithmetic☆277Sep 17, 2022Updated 3 years ago
- CaMEL: Mean Teacher Learning for Image Captioning. ICPR 2022☆29Dec 1, 2022Updated 3 years ago
- PyTorch code for "Fine-grained Image Captioning with CLIP Reward" (Findings of NAACL 2022)☆246Jun 10, 2025Updated 10 months ago
- PyTorch code for "VL-Adapter: Parameter-Efficient Transfer Learning for Vision-and-Language Tasks" (CVPR2022)☆211Dec 18, 2022Updated 3 years ago
- SotA text-only image/video method (IJCAI 2023)☆15Jan 9, 2024Updated 2 years ago
- 基于ClipCap的看图说话Image Caption模型☆322Apr 1, 2022Updated 4 years ago
- A simple and effective feature extractor for untrimmed videos☆13Sep 1, 2022Updated 3 years ago
- METER: A Multimodal End-to-end TransformER Framework☆376Nov 16, 2022Updated 3 years ago
- [ICLR2023] PLOT: Prompt Learning with Optimal Transport for Vision-Language Models☆175Dec 14, 2023Updated 2 years ago
- GPU virtual machines on DigitalOcean Gradient AI • AdGet to production fast with high-performance AMD and NVIDIA GPUs you can spin up in seconds. The definition of operational simplicity.
- Pytorch code for Language Models with Image Descriptors are Strong Few-Shot Video-Language Learners☆117Sep 15, 2022Updated 3 years ago
- implementation of paper https://arxiv.org/abs/2210.04559☆57Nov 26, 2025Updated 4 months ago
- ☆15Nov 19, 2020Updated 5 years ago
- ☆200May 10, 2023Updated 2 years ago
- [ECCV 2024] Official PyTorch implementation of "HYPE: Hyperbolic Entailment Filtering for Underspecified Images and Texts"☆20Nov 22, 2024Updated last year
- ☆21Mar 19, 2024Updated 2 years ago
- Mind the Gap: Understanding the Modality Gap in Multi-modal Contrastive Representation Learning☆173Sep 26, 2022Updated 3 years ago
- ☆195Mar 5, 2025Updated last year
- Evaluating Vision & Language Pretraining Models with Objects, Attributes and Relations. [EMNLP 2022]☆138Sep 29, 2024Updated last year
- Virtual machines for every use case on DigitalOcean • AdGet dependable uptime with 99.99% SLA, simple security tools, and predictable monthly pricing with DigitalOcean's virtual machines, called Droplets.
- Research code for CVPR 2022 paper: "EMScore: Evaluating Video Captioning via Coarse-Grained and Fine-Grained Embedding Matching"☆26Oct 20, 2022Updated 3 years ago
- GRIT: Faster and Better Image-captioning Transformer (ECCV 2022)☆199May 9, 2023Updated 2 years ago
- Align and Prompt: Video-and-Language Pre-training with Entity Prompts☆188May 1, 2025Updated 11 months ago
- [ICLR 2022] code for "How Much Can CLIP Benefit Vision-and-Language Tasks?" https://arxiv.org/abs/2107.06383☆419Oct 28, 2022Updated 3 years ago
- Momentum Decoding: Open-ended Text Generation as Graph Exploration☆19Jan 27, 2023Updated 3 years ago
- Repository for the paper "Data Efficient Masked Language Modeling for Vision and Language".☆18Sep 17, 2021Updated 4 years ago
- A Good Prompt Is Worth Millions of Parameters: Low-resource Prompt-based Learning for Vision-Language Models (ACL 2022)☆42May 13, 2022Updated 3 years ago
- Code for the paper: "SuS-X: Training-Free Name-Only Transfer of Vision-Language Models" [ICCV'23]☆105Aug 22, 2023Updated 2 years ago
- Cross Modal Retrieval with Querybank Normalisation☆57Nov 21, 2023Updated 2 years ago
- Open source password manager - Proton Pass • AdSecurely store, share, and autofill your credentials with Proton Pass, the end-to-end encrypted password manager trusted by millions.
- [CVPR 2022] Official code for "RegionCLIP: Region-based Language-Image Pretraining"☆809Mar 20, 2024Updated 2 years ago
- ☆36Feb 5, 2024Updated 2 years ago
- Experiments and data for the paper "When and why vision-language models behave like bags-of-words, and what to do about it?" Oral @ ICLR …☆294Jun 7, 2023Updated 2 years ago
- [ACL 2023] Official PyTorch code for Singularity model in "Revealing Single Frame Bias for Video-and-Language Learning"☆136May 5, 2023Updated 2 years ago
- aft - advanced file transfer.☆45Feb 27, 2026Updated last month
- Code for the paper "Understanding and Evaluating Racial Biases in Image Captioning"☆12Mar 26, 2026Updated 2 weeks ago
- Official Repository of ChatCaptioner☆468Apr 13, 2023Updated 2 years ago