Generate text captions for images from their embeddings.
☆119 · Aug 1, 2023 · Updated 2 years ago
Alternatives and similar repositories for clip-text-decoder
Users that are interested in clip-text-decoder are comparing it to the libraries listed below.
- ICLR 2023 DeCap: Decoding CLIP Latents for Zero-shot Captioning · ☆140 · Mar 16, 2023 · Updated 3 years ago
- Image Captioning using combination of object detection via YOLOv5 and Encoder Decoder LSTM model · ☆15 · Oct 13, 2022 · Updated 3 years ago
- [ECCV2022] Source Code for "Improving GANs for Long-Tailed Data through Group Spectral Regularization" · ☆16 · Oct 2, 2022 · Updated 3 years ago
- babyLM WhisBERT code · ☆19 · May 27, 2024 · Updated last year
- A pytorch implementation of Attention Is All You Need (Transformer) for image captioning. · ☆12 · Nov 15, 2021 · Updated 4 years ago
- ☆11 · Oct 2, 2024 · Updated last year
- Implementation of Zero-Shot Image-to-Text Generation for Visual-Semantic Arithmetic · ☆277 · Sep 17, 2022 · Updated 3 years ago
- implementation of paper https://arxiv.org/abs/2210.04559 · ☆57 · Nov 26, 2025 · Updated 4 months ago
- ☆12 · Sep 19, 2021 · Updated 4 years ago
- t-vMF Similarity for Regularizing Intra-Class Feature Distribution · ☆21 · Jun 11, 2021 · Updated 4 years ago
- X-MIC: Cross-Modal Instance Conditioning for Egocentric Action Generalization, CVPR 2024 · ☆11 · Nov 7, 2024 · Updated last year
- S-CLIP: Semi-supervised Vision-Language Pre-training using Few Specialist Captions · ☆51 · May 26, 2023 · Updated 2 years ago
- Continuous diffusion for layout generation · ☆54 · Feb 19, 2025 · Updated last year
- Sparkles: Unlocking Chats Across Multiple Images for Multimodal Instruction-Following Models · ☆45 · Jun 14, 2024 · Updated last year
- Simple image captioning model · ☆1,416 · Jun 9, 2024 · Updated last year
- WildLife Documentary Dataset · ☆14 · Jun 19, 2017 · Updated 8 years ago
- Using LLMs and pre-trained caption models for super-human performance on image captioning. · ☆42 · Oct 13, 2023 · Updated 2 years ago
- Official implementation of "Perturbed-Attention Guidance" · ☆60 · Jul 2, 2024 · Updated last year
- [BMVC2024] Erasing Concepts from Text-to-Image Diffusion Models with Few-shot Unlearning · ☆14 · Feb 14, 2026 · Updated 2 months ago
- Codes and scripts for "Explainable Semantic Space by Grounding Language to Vision with Cross-Modal Contrastive Learning" · ☆20 · Mar 23, 2022 · Updated 4 years ago
- CLIP is an open source, multimodal computer vision model and it's awesome! · ☆17 · Dec 16, 2024 · Updated last year
- Code for the paper "Multi-Task Learning of Object States and State-Modifying Actions from Web Videos" published in TPAMI · ☆11 · Mar 3, 2024 · Updated 2 years ago
- InstructionGPT-4 · ☆42 · Dec 29, 2023 · Updated 2 years ago
- Official implementation of OSSGAN [CVPR 2022] · ☆21 · May 2, 2022 · Updated 3 years ago
- Official codebase for our paper "Joslim: Joint Widths and Weights Optimization for Slimmable Neural Networks" · ☆12 · Jun 30, 2021 · Updated 4 years ago
- Code for our ICLR'2022 paper "Generalizing Few-Shot NAS with Gradient Matching" · ☆22 · Oct 30, 2022 · Updated 3 years ago
- Style Transfer by Rigid Alignment in Neural Net Feature Space · ☆11 · Jan 23, 2021 · Updated 5 years ago
- ☆20 · May 3, 2025 · Updated 11 months ago
- ☆59 · Aug 30, 2023 · Updated 2 years ago
- ☆22 · Sep 13, 2021 · Updated 4 years ago
- Official code for the paper "Self-Distillation for Few-Shot Image Captioning" · ☆18 · Mar 15, 2021 · Updated 5 years ago
- Using pretrained encoder and language models to generate captions from multimedia inputs. · ☆100 · Mar 11, 2023 · Updated 3 years ago
- Official code for "Accelerating Feedforward Computation via Parallel Nonlinear Equation Solving", ICML 2021 · ☆29 · Sep 25, 2021 · Updated 4 years ago
- Learned User Representations in Online Social Networks (Twitter) using Temporal Dynamics of Information Diffusion. · ☆10 · Oct 15, 2018 · Updated 7 years ago
- Visualizing data to better monitor issues around food security · ☆14 · Nov 28, 2024 · Updated last year
- Deliberate Attention Networks for Image Captioning (AAAI 2019) · ☆11 · Sep 30, 2019 · Updated 6 years ago
- A repo for shared Jupyter and Colab notebooks · ☆23 · Jul 4, 2025 · Updated 9 months ago
- Demos of neural image editing · ☆11 · Mar 15, 2021 · Updated 5 years ago
- The codebase for Inducing Causal Structure for Interpretable Neural Networks · ☆11 · Dec 3, 2021 · Updated 4 years ago