CLIPxGPT Captioner is an image captioning model based on OpenAI's CLIP and GPT-2.
☆118Feb 17, 2025Updated last year
Alternatives and similar repositories for clip-gpt-captioning
Users who are interested in clip-gpt-captioning are comparing it to the libraries listed below.
- Differentiable Patch Selection☆15Feb 20, 2023Updated 3 years ago
- An up-to-date & curated list of awesome layout to image papers, methods & resources.☆13Jun 28, 2024Updated 2 years ago
- Retrieval-augmented Image Captioning☆13Feb 16, 2023Updated 3 years ago
- Simple image captioning model☆1,422Jun 9, 2024Updated 2 years ago
- This repository contains the code and datasets for our ICCV-W paper 'Enhancing CLIP with GPT-4: Harnessing Visual Descriptions as Prompts…☆30Feb 21, 2024Updated 2 years ago
- Image captioning model based on ClipCap☆324Apr 1, 2022Updated 4 years ago
- ☆31May 26, 2025Updated last year
- ☆20May 3, 2025Updated last year
- ☆11May 5, 2024Updated 2 years ago
- GRIT: Faster and Better Image-captioning Transformer (ECCV 2022)☆199May 9, 2023Updated 3 years ago
- CapDec: SOTA Zero Shot Image Captioning Using CLIP and GPT2, EMNLP 2022 (findings)☆209Jan 28, 2024Updated 2 years ago
- Convolutional Neural Networks, fork with trained NN for aerial car detection☆18May 23, 2018Updated 8 years ago
- Timecode and chapter generator for YouTube videos based on heuristic scene detection and CLIPxGPT Captioner☆12Mar 13, 2025Updated last year
- LLM-based character segmentation agent for ComfyUI based on SAM 3 and the SAM 3 Agent notebook☆28Dec 22, 2025Updated 6 months ago
- Data repository for the VALSE benchmark.☆39Feb 15, 2024Updated 2 years ago
- Pytorch implementation of "ST360IQ: NO-REFERENCE OMNIDIRECTIONAL IMAGE QUALITY ASSESSMENT WITH SPHERICAL VISION TRANSFORMERS"☆14May 12, 2023Updated 3 years ago
- Saliency prediction on 360° image with SalGAN☆16Jan 5, 2021Updated 5 years ago
- ☆47Oct 5, 2025Updated 9 months ago
- ☆12Nov 6, 2024Updated last year
- COMIC: This is the code repo of our TMM2019 work titled "COMIC: Towards a Compact Image Captioning Model with Attention".☆15Jun 22, 2021Updated 5 years ago
- ☆18Oct 5, 2024Updated last year
- Official Code for GazeGNN: A Gaze-guided Graph Neural Network for Chest X-ray Classification [WACV 2024]☆21Aug 25, 2023Updated 2 years ago
- An automatic MLLM hallucination detection framework☆19Sep 26, 2023Updated 2 years ago
- A Comprehensive Open-Source Toolkit for Neural Image Compression and Robustness Analysis☆51Jul 15, 2025Updated 11 months ago
- NICE challenge 2023 Track2 2nd result (total 4th) (CVPR 2023) sponsored by LG AI/Shutterstock/SNU☆11Jun 22, 2023Updated 3 years ago
- This repository is for the paper "Is BERT Blind? Exploring the Effect of Vision-and-Language Pretraining on Visual Language Understanding…☆21Nov 2, 2023Updated 2 years ago
- PyTorch code for "Fine-grained Image Captioning with CLIP Reward" (Findings of NAACL 2022)☆246Jun 10, 2025Updated last year
- This repository contains code for CVPR 2019 paper "Efficient Video Classification Using Fewer Frames"☆19Mar 10, 2021Updated 5 years ago
- [ICCV 2025] The official PyTorch implementation of "LLaVA-SP: Enhancing Visual Representation with Visual Spatial Tokens for MLLMs".☆24Oct 28, 2025Updated 8 months ago
- ICLR 2023 DeCap: Decoding CLIP Latents for Zero-shot Captioning☆144Mar 16, 2023Updated 3 years ago
- This repo contains the code to reproduce the paper: "Enriched Music Representations with Multiple Cross-modal Contrastive Learning"☆15Jun 22, 2023Updated 3 years ago
- Image Captioning using LSTM and Deep Learning on Flickr8K dataset.☆15Feb 1, 2022Updated 4 years ago
- Deferring loading of JS files until after React loads☆10Dec 4, 2022Updated 3 years ago
- A large-scale benchmark for the evaluation of embeddings across a number of fine-grained and instance-level visual domains.☆17Jun 14, 2024Updated 2 years ago
- CVPR-NTIRE 2025 Challenge on UGC Video Enhancement☆23May 30, 2025Updated last year
- a pytorch implementation of pensieve (https://github.com/hongzimao/pensieve)☆21Dec 24, 2019Updated 6 years ago
- ☆16Mar 9, 2023Updated 3 years ago
- Code and data for the COLING 2020 paper "Try to Substitute: An Unsupervised Chinese Word Sense Disambiguation Method Based on HowNet"☆14Dec 2, 2020Updated 5 years ago
- Code for VCRNet: Visual Compensation Restoration Network for No-Reference Image Quality Assessment☆25Apr 12, 2023Updated 3 years ago