TheoCoombes / ClipCap

Using pretrained encoder and language models to generate captions from multimedia inputs.

☆97

Alternatives and similar repositories for ClipCap

Users who are interested in ClipCap are comparing it to the libraries listed below.

MIMICLab / L-Verse
L-Verse: Bidirectional Generation Between Image and Text
☆109 Updated 6 months ago
iejMac / clip-video-encode
Easily compute clip embeddings from video frames
☆147 Updated 2 years ago
j-min / DallEval
DALL-Eval: Probing the Reasoning Skills and Social Biases of Text-to-Image Generation Models (ICCV 2023)
☆142 Updated 4 months ago
lucidrains / retrieval-augmented-ddpm
Implementation of Retrieval-Augmented Denoising Diffusion Probabilistic Models in Pytorch
☆65 Updated 3 years ago
j-min / CLIP-Caption-Reward
PyTorch code for "Fine-grained Image Captioning with CLIP Reward" (Findings of NAACL 2022)
☆246 Updated 4 months ago
facebookresearch / SIMAT
codebase for the SIMAT dataset and evaluation
☆38 Updated 3 years ago
afiaka87 / glide-finetune
Finetune glide-text2im from openai on your own data.
☆89 Updated 3 weeks ago
LAION-AI / temporal-embedding-aggregation
Aggregating embeddings over time
☆32 Updated 2 years ago
redcaps-dataset / redcaps-downloader
Command-line tool for downloading and extending the RedCaps dataset.
☆49 Updated last year
LAION-AI / video-clip
Let's make a video clip
☆95 Updated 3 years ago
Zasder3 / train-CLIP-FT
☆48 Updated 4 years ago
rom1504 / embedding-reader
Efficiently read embedding in streaming from any filesystem
☆102 Updated 2 months ago
google-research / xmcgan_image_generation
☆97 Updated 2 months ago
joaanna / disentangling_spelling_in_clip
☆34 Updated 2 years ago
CompVis / imagebart
ImageBART: Bidirectional Context with Multinomial Diffusion for Autoregressive Image Synthesis
☆125 Updated 3 years ago
mlfoundations / imagenet-captions
Release of ImageNet-Captions
☆51 Updated 2 years ago
pbaylies / clustering-laion400m
Script and models for clustering LAION-400m CLIP embeddings.
☆26 Updated 3 years ago
lucidrains / flexible-diffusion-modeling-videos-pytorch
Implementation of the video diffusion model and training scheme presented in the paper, Flexible Diffusion Modeling of Long Videos, in Py…
☆85 Updated 3 years ago
google-research-datasets / videoCC-data
VideoCC is a dataset containing (video-URL, caption) pairs for training video-text machine learning models. It is created using an automa…
☆78 Updated 2 years ago
xuewyang / Fashion_Captioning
ECCV2020 paper: Fashion Captioning: Towards Generating Accurate Descriptions with Semantic Rewards. Code and Data.
☆85 Updated 2 years ago
wade3han / champagne
An official codebase for the paper "CHAMPAGNE: Learning Real-world Conversation from Large-Scale Web Videos" (ICCV 23)
☆52 Updated 2 years ago
ml-jku / cloob
☆160 Updated 3 years ago
fkodom / clip-text-decoder
Generate text captions for images from their embeddings.
☆115 Updated 2 years ago
LAION-AI / General-GPT
☆65 Updated 2 years ago
tgisaturday / dalle-lightning
Refactoring dalle-pytorch and taming-transformers for TPU VM
☆60 Updated 4 years ago
weiyx16 / CLIP-pytorch
A non-JIT version implementation / replication of CLIP of OpenAI in pytorch
☆34 Updated 4 years ago
ryanwebster90 / snip-dedup
☆103 Updated last year
patil-suraj / vit-vqgan
JAX implementation of ViT-VQGAN
☆82 Updated 3 years ago
crowsonkb / cloob-training
CLOOB training (JAX) and inference (JAX and PyTorch)
☆74 Updated 3 years ago
yxuansu / MAGIC
Language Models Can See: Plugging Visual Controls in Text Generation
☆259 Updated 3 years ago