CLIPxGPT Captioner is an image captioning model based on OpenAI's CLIP and GPT-2.
☆118Feb 17, 2025Updated last year
Alternatives and similar repositories for clip-gpt-captioning
Users that are interested in clip-gpt-captioning are comparing it to the libraries listed below
- An up-to-date & curated list of awesome layout to image papers, methods & resources.☆13Jun 28, 2024Updated last year
- Retrieval-augmented Image Captioning☆13Feb 16, 2023Updated 3 years ago
- Simple image captioning model☆1,413Jun 9, 2024Updated last year
- Public demos using the Cohere platform!☆11May 24, 2023Updated 2 years ago
- This repository contains the code and datasets for our ICCV-W paper 'Enhancing CLIP with GPT-4: Harnessing Visual Descriptions as Prompts…☆30Feb 21, 2024Updated 2 years ago
- An image captioning (picture description) model based on ClipCap☆321Apr 1, 2022Updated 3 years ago
- ☆20May 3, 2025Updated 10 months ago
- ☆12May 5, 2024Updated last year
- GRIT: Faster and Better Image-captioning Transformer (ECCV 2022)☆198May 9, 2023Updated 2 years ago
- CapDec: SOTA Zero Shot Image Captioning Using CLIP and GPT2, EMNLP 2022 (findings)☆203Jan 28, 2024Updated 2 years ago
- Convolutional Neural Networks, fork with trained NN for aerial car detection☆18May 23, 2018Updated 7 years ago
- LLM-based character segmentation agent for ComfyUI based on SAM 3 and the SAM 3 Agent notebook☆25Dec 22, 2025Updated 2 months ago
- ☆45Oct 5, 2025Updated 5 months ago
- Saliency prediction on 360° image with SalGAN☆16Jan 5, 2021Updated 5 years ago
- Create your own DALL-E application in Python with Streamlit.☆12Mar 9, 2023Updated 3 years ago
- ☆12Nov 6, 2024Updated last year
- Simple repository for training small reasoning models☆49Feb 17, 2026Updated last month
- Jin, Xiao, et al. "FCMNet: Frequency-aware cross-modality attention networks for RGB-D salient object detection." Neurocomputing 491 (202…☆11Apr 11, 2024Updated last year
- COMIC: This is the code repo of our TMM2019 work titled "COMIC: Towards a Compact Image Captioning Model with Attention".☆15Jun 22, 2021Updated 4 years ago
- Official Code for GazeGNN: A Gaze-guided Graph Neural Network for Chest X-ray Classification [WACV 2024]☆21Aug 25, 2023Updated 2 years ago
- An automatic MLLM hallucination detection framework☆19Sep 26, 2023Updated 2 years ago
- This repository is for the paper "Is BERT Blind? Exploring the Effect of Vision-and-Language Pretraining on Visual Language Understanding…☆21Nov 2, 2023Updated 2 years ago
- Implementation of 'End-to-End Transformer Based Model for Image Captioning' [AAAI 2022]☆69Jun 1, 2024Updated last year
- Codebase for the paper HawkI: Homography & Mutual Information Guidance for 3D-free Single Image to Aerial View☆13Jun 5, 2024Updated last year
- PyTorch code for "Fine-grained Image Captioning with CLIP Reward" (Findings of NAACL 2022)☆246Jun 10, 2025Updated 9 months ago
- This repository contains code for CVPR 2019 paper "Efficient Video Classification Using Fewer Frames"☆20Mar 10, 2021Updated 5 years ago
- Some papers about *diverse* image (a few videos) captioning☆26Apr 4, 2023Updated 2 years ago
- Extend BoxDiff to SDXL (SDXL-based layout-to-image generation)☆27May 23, 2024Updated last year
- Official code for "Automated Scoring for Reading Comprehension via In-context BERT Tuning" (AIED 2022)☆13May 23, 2022Updated 3 years ago
- AAAI-2024☆23Sep 18, 2025Updated 6 months ago
- Deferring loading of JS files until after React loads☆10Dec 4, 2022Updated 3 years ago
- A large-scale benchmark for the evaluation of embeddings across a number of fine-grained and instance-level visual domains.☆17Jun 14, 2024Updated last year
- Medical image captioning using OpenAI's CLIP☆95Mar 7, 2023Updated 3 years ago
- Some time series vectorization methods which could give better representation for classification / clustering or other analysis.☆11Jan 4, 2016Updated 10 years ago
- ☆15Mar 9, 2023Updated 3 years ago
- Code and data for the COLING 2020 paper "Try to Substitute: An Unsupervised Chinese Word Sense Disambiguation Method Based on HowNet"☆14Dec 2, 2020Updated 5 years ago
- LoLI-Street is a low-light image enhancement dataset for training and testing low-light image enhancement models under urban street scene…☆38Apr 30, 2025Updated 10 months ago
- ☆54Aug 3, 2023Updated 2 years ago
- Code for VCRNet: Visual Compensation Restoration Network for No-Reference Image Quality Assessment☆24Apr 12, 2023Updated 2 years ago