Using pretrained encoder and language models to generate captions from multimedia inputs.
☆100Mar 11, 2023Updated 3 years ago
Alternatives and similar repositories for ClipCap
Users that are interested in ClipCap are comparing it to the libraries listed below.
- I have created a dataset of Image-Text-Pairs by using the cosine similarity of the CLIP embeddings of the image & its caption derived f…☆16Apr 22, 2021Updated 4 years ago
- CaMEL: Mean Teacher Learning for Image Captioning. ICPR 2022☆29Dec 1, 2022Updated 3 years ago
- Aim for the moon. If you miss, you may hit a star.☆164Feb 14, 2023Updated 3 years ago
- Refactoring dalle-pytorch and taming-transformers for TPU VM☆60Aug 30, 2021Updated 4 years ago
- ☆21Mar 15, 2023Updated 3 years ago
- PyTorch code for "Fine-grained Image Captioning with CLIP Reward" (Findings of NAACL 2022)☆246Jun 10, 2025Updated 9 months ago
- Let's make a video clip☆96Jul 29, 2022Updated 3 years ago
- Simple image captioning model☆1,414Jun 9, 2024Updated last year
- Implementation of the DeepMind Flamingo vision-language model, based on Hugging Face language models and ready for training☆169Apr 27, 2023Updated 2 years ago
- ☆112Aug 5, 2021Updated 4 years ago
- MusicGen conditioned with chord progression.☆11Oct 7, 2023Updated 2 years ago
- Efficiently read embeddings in streaming from any filesystem☆105Aug 9, 2025Updated 7 months ago
- GLIDE: a diffusion-based text-conditional image synthesis model. Now with example files for local running.☆11Jan 25, 2022Updated 4 years ago
- This project provides a data set with bounding boxes, body poses, 3D face meshes & captions of people from our LAION-2.2B. Additionally i…☆14Jan 2, 2022Updated 4 years ago
- Aggregating embeddings over time☆32Jan 19, 2023Updated 3 years ago
- Dataset for Paper "Exploring Content Selection in Summarization of Novel Chapters"☆14Mar 20, 2023Updated 3 years ago
- Controllable image captioning model with unsupervised modes☆21Apr 14, 2023Updated 2 years ago
- Inverts CLIP text embeds to image embeds and visualizes with deep-image-prior.☆35Jul 3, 2022Updated 3 years ago
- Home of `erlich` and `ongo`. Finetune latent-diffusion/glid-3-xl text2image on your own data.☆181Aug 5, 2022Updated 3 years ago
- Easily compute clip embeddings and build a clip retrieval system with them☆2,734Aug 15, 2025Updated 7 months ago
- Learning Algebraic Representation for Systematic Generalization in Abstract Reasoning☆11Jul 20, 2022Updated 3 years ago
- Get hundreds of millions of image+url from the crawling at home dataset and preprocess them☆223May 26, 2024Updated last year
- CapDec: SOTA Zero Shot Image Captioning Using CLIP and GPT2, EMNLP 2022 (findings)☆204Jan 28, 2024Updated 2 years ago
- OpenAI CLIP text encoders for multiple languages!☆828May 15, 2023Updated 2 years ago
- Official PyTorch implementation of StyleGAN3☆25Oct 15, 2021Updated 4 years ago
- Code for LaMPP: Language Models as Probabilistic Priors for Perception and Action☆37Apr 3, 2023Updated 2 years ago
- EasyRLHF aims to provide an easy and minimal interface to train aligned language models, using off-the-shelf solutions and datasets☆10Dec 12, 2023Updated 2 years ago
- StableDiffusion scripts based on huggingface diffusers.☆15Feb 23, 2025Updated last year
- Repository for "Generating images from caption and vice versa via CLIP-Guided Generative Latent Space Search"☆179Sep 30, 2021Updated 4 years ago
- ☆10Aug 25, 2019Updated 6 years ago
- Un-*** 50 billions multimodality dataset☆23Sep 14, 2022Updated 3 years ago
- Xfce Desktop container designed for direct access to the GPU with EGL using VirtualGL for GPUs. Does not require /tmp/.X11-unix host sock…☆10Jul 25, 2022Updated 3 years ago
- CLOOB training (JAX) and inference (JAX and PyTorch)☆74May 16, 2022Updated 3 years ago
- Doing style transfer with linguistic features using OpenAI's CLIP.☆14May 4, 2021Updated 4 years ago
- SimVLM: Simple Visual Language Model Pretraining with Weak Supervision☆36Nov 7, 2022Updated 3 years ago
- Majesty Diffusion by @Dango233 and @apolinario (@multimodalart)☆25Jul 26, 2022Updated 3 years ago
- ☆12Mar 16, 2022Updated 4 years ago
- Memory-efficient transformer. Work in progress.☆19Sep 17, 2022Updated 3 years ago
- ☆11Sep 7, 2020Updated 5 years ago