Generate text captions for images from their embeddings.
☆119 · Aug 1, 2023 · Updated 2 years ago
Alternatives and similar repositories for clip-text-decoder
Users that are interested in clip-text-decoder are comparing it to the libraries listed below.
- Corpus and code for Aligned Recipe Actions (ARA) corpus, EMNLP 2021 ☆10 · May 22, 2024 · Updated 2 years ago
- ICLR 2023 DeCap: Decoding CLIP Latents for Zero-shot Captioning ☆143 · Mar 16, 2023 · Updated 3 years ago
- Image Captioning using combination of object detection via YOLOv5 and Encoder Decoder LSTM model ☆15 · Oct 13, 2022 · Updated 3 years ago
- [ECCV2022] Source Code for "Improving GANs for Long-Tailed Data through Group Spectral Regularization" ☆16 · Oct 2, 2022 · Updated 3 years ago
- A pytorch implementation of Attention Is All You Need (Transformer) for image captioning. ☆12 · Nov 15, 2021 · Updated 4 years ago
- ☆11 · Oct 2, 2024 · Updated last year
- A Pytorch implementation of the paper 'Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering' ☆10 · Jan 20, 2020 · Updated 6 years ago
- implementation of paper https://arxiv.org/abs/2210.04559 ☆57 · Nov 26, 2025 · Updated 6 months ago
- Implementation of Zero-Shot Image-to-Text Generation for Visual-Semantic Arithmetic ☆277 · Sep 17, 2022 · Updated 3 years ago
- ☆12 · Sep 19, 2021 · Updated 4 years ago
- t-vMF Similarity for Regularizing Intra-Class Feature Distribution ☆21 · Jun 11, 2021 · Updated 5 years ago
- X-MIC: Cross-Modal Instance Conditioning for Egocentric Action Generalization, CVPR 2024 ☆11 · Nov 7, 2024 · Updated last year
- Continuous diffusion for layout generation ☆56 · Feb 19, 2025 · Updated last year
- DDSP-FM: a differentiable FM synth based on Magenta's DDSP library. ☆22 · Jun 14, 2021 · Updated 5 years ago
- Sparkles: Unlocking Chats Across Multiple Images for Multimodal Instruction-Following Models ☆45 · Jun 14, 2024 · Updated 2 years ago
- Simple image captioning model ☆1,420 · Jun 9, 2024 · Updated 2 years ago
- Using LLMs and pre-trained caption models for super-human performance on image captioning. ☆42 · Oct 13, 2023 · Updated 2 years ago
- Official implementation of "Perturbed-Attention Guidance" ☆60 · Jul 2, 2024 · Updated last year
- [BMVC2024] Erasing Concepts from Text-to-Image Diffusion Models with Few-shot Unlearning ☆14 · May 21, 2026 · Updated 3 weeks ago
- Synthesis of percussion sounds using sinusoidal modelling, DDSP noise synthesis, and a neural source filter approach. ☆34 · Jan 7, 2025 · Updated last year
- Codes and scripts for "Explainable Semantic Space by Grounding Language to Vision with Cross-Modal Contrastive Learning" ☆20 · Mar 23, 2022 · Updated 4 years ago
- CLIP is an open source, multimodal computer vision model and it's awesome! ☆17 · Dec 16, 2024 · Updated last year
- Code for the paper "Multi-Task Learning of Object States and State-Modifying Actions from Web Videos" published in TPAMI ☆11 · Mar 3, 2024 · Updated 2 years ago
- InstructionGPT-4 ☆42 · Dec 29, 2023 · Updated 2 years ago
- ☆16 · Jan 3, 2023 · Updated 3 years ago
- Official implementation of OSSGAN [CVPR 2022] ☆21 · May 2, 2022 · Updated 4 years ago
- Music Demixing Challenge Submission Repo ☆16 · Sep 8, 2023 · Updated 2 years ago
- Frozen Pretrained Transformers for Neural Sign Language Translation ☆15 · Apr 23, 2022 · Updated 4 years ago
- Official implementation for Diffusion Models Are Innate One-Step Generators ☆26 · Jun 25, 2025 · Updated 11 months ago
- ☆20 · May 3, 2025 · Updated last year
- ☆59 · Aug 30, 2023 · Updated 2 years ago
- ☆22 · Sep 13, 2021 · Updated 4 years ago
- This is tensorflow 2.2 based SCAMET framework for remote sensing image captioning. ☆13 · Aug 10, 2023 · Updated 2 years ago
- Code for ICLR 2023 Paper, "Stable Target Field for Reduced Variance Score Estimation in Diffusion Models" ☆76 · Jun 6, 2023 · Updated 3 years ago
- Using pretrained encoder and language models to generate captions from multimedia inputs. ☆100 · Mar 11, 2023 · Updated 3 years ago
- Official repo for arxiv paper "Stem-OB: Generalizable Visual Imitation Learning with Stem-Like Convergent Observation through Diffusion I… ☆17 · Nov 8, 2024 · Updated last year
- ☆16 · May 7, 2023 · Updated 3 years ago
- BEAR: a new BEnchmark on video Action Recognition ☆46 · Apr 21, 2024 · Updated 2 years ago
- Deliberate Attention Networks for Image Captioning (AAAI 2019) ☆11 · Sep 30, 2019 · Updated 6 years ago