jmhessel / clipscore
CLIPScore EMNLP code
☆194 Updated last year
Related projects
Alternatives and complementary repositories for clipscore
- CapDec: SOTA Zero Shot Image Captioning Using CLIP and GPT2, EMNLP 2022 (findings) ☆185 Updated 9 months ago
- Align and Prompt: Video-and-Language Pre-training with Entity Prompts ☆185 Updated 2 years ago
- ICLR 2023 DeCap: Decoding CLIP Latents for Zero-shot Captioning ☆125 Updated last year
- TIFA: Accurate and Interpretable Text-to-Image Faithfulness Evaluation with Question Answering ☆137 Updated 6 months ago
- PyTorch code for "VL-Adapter: Parameter-Efficient Transfer Learning for Vision-and-Language Tasks" (CVPR2022) ☆202 Updated last year
- [NeurIPS 2023] Self-Chained Image-Language Model for Video Localization and Question Answering ☆178 Updated 10 months ago
- [NeurIPS 2023] Text data, code and pre-trained models for paper "Improving CLIP Training with Language Rewrites" ☆260 Updated 10 months ago
- Official PyTorch Implementation of Synthesizing Coherent Story with Auto-Regressive Latent Diffusion Models ☆192 Updated last year
- Implementation of Zero-Shot Image-to-Text Generation for Visual-Semantic Arithmetic ☆269 Updated 2 years ago
- Code for paper LAFITE: Towards Language-Free Training for Text-to-Image Generation (CVPR 2022) ☆180 Updated last year
- [NeurIPS 2023] T2I-CompBench: A Comprehensive Benchmark for Open-world Compositional Text-to-image Generation ☆211 Updated 2 weeks ago
- Densely Captioned Images (DCI) dataset repository. ☆159 Updated 4 months ago
- DALL-Eval: Probing the Reasoning Skills and Social Biases of Text-to-Image Generation Models (ICCV 2023) ☆137 Updated 11 months ago
- Easy-to-use, efficient code for extracting OpenAI CLIP (Global/Grid) features from images and text. ☆111 Updated 2 years ago
- RichHF-18K dataset contains rich human feedback labels we collected for our CVPR'24 paper: https://arxiv.org/pdf/2312.10240, along with t… ☆105 Updated 4 months ago
- [ACL 2023] Official PyTorch code for Singularity model in "Revealing Single Frame Bias for Video-and-Language Learning" ☆130 Updated last year
- Official repository for the A-OKVQA dataset ☆64 Updated 6 months ago
- An official implementation for "X-CLIP: End-to-End Multi-grained Contrastive Learning for Video-Text Retrieval" ☆136 Updated 7 months ago
- SmallCap: Lightweight Image Captioning Prompted with Retrieval Augmentation ☆95 Updated 9 months ago
- GRIT: Faster and Better Image-captioning Transformer (ECCV 2022) ☆184 Updated last year
- [NeurIPS 2022] Zero-Shot Video Question Answering via Frozen Bidirectional Language Models ☆156 Updated last year
- LLMScore: Unveiling the Power of Large Language Models in Text-to-Image Synthesis Evaluation ☆125 Updated last year
- Official implementation of "ConZIC: Controllable Zero-shot Image Captioning by Sampling-Based Polishing" ☆73 Updated last year
- Official repo for "VideoScore: Building Automatic Metrics to Simulate Fine-grained Human Feedback for Video Generation" [EMNLP 2024] ☆50 Updated this week
- [CVPR 2023] All in One: Exploring Unified Video-Language Pre-training ☆280 Updated last year
- [NeurIPS 2022 Spotlight] Expectation-Maximization Contrastive Learning for Compact Video-and-Language Representations ☆121 Updated 7 months ago