tuhinjubcse / VisualMetaphors
Code and data for the ACL 2023 paper "I Spy a Metaphor: Large Language Models and Diffusion Models Co-Create Visual Metaphors"
☆12Updated last year
Alternatives and similar repositories for VisualMetaphors
Users who are interested in VisualMetaphors are comparing it to the repositories listed below
- IRFL: Image Recognition of Figurative Language☆11Updated last year
- Visual Instruction-guided Explainable Metric. Code for "Towards Explainable Metrics for Conditional Image Synthesis Evaluation" (ACL 2024…☆38Updated 6 months ago
- Preference Learning for LLaVA☆44Updated 6 months ago
- Source code for paper: "AltDiffusion: A multilingual Text-to-Image diffusion model"☆39Updated last year
- ☆30Updated last year
- Code, data, models for the Sherlock corpus☆57Updated 2 years ago
- ☆35Updated last year
- Visual Programming for Text-to-Image Generation and Evaluation (NeurIPS 2023)☆56Updated last year
- How well can Text-to-Image Generative Models understand Ethical Natural Language Interventions?☆13Updated last year
- [2024-ACL]: TextBind: Multi-turn Interleaved Multimodal Instruction-following in the Wild☆47Updated last year
- Repository for the paper: Dense and Aligned Captions (DAC) Promote Compositional Reasoning in VL Models☆27Updated last year
- VPEval Codebase from Visual Programming for Text-to-Image Generation and Evaluation (NeurIPS 2023)☆44Updated last year
- Official code repo for "Editing Implicit Assumptions in Text-to-Image Diffusion Models"☆86Updated 2 years ago
- ☆11Updated 3 weeks ago
- Code for 'Why is Winoground Hard? Investigating Failures in Visuolinguistic Compositionality', EMNLP 2022☆30Updated last year
- CLIPScore EMNLP code☆222Updated 2 years ago
- Davidsonian Scene Graph (DSG) for Text-to-Image Evaluation (ICLR 2024)☆87Updated 5 months ago
- Improving Language Understanding from Screenshots. Paper: https://arxiv.org/abs/2402.14073☆28Updated 10 months ago
- ☆24Updated last year
- ☆48Updated last year
- ☆18Updated 9 months ago
- The released data for paper "Measuring and Improving Chain-of-Thought Reasoning in Vision-Language Models".☆32Updated last year
- Code for ReFocus: Visual Editing as a Chain of Thought for Structured Image Understanding☆30Updated 2 weeks ago
- Official implementation of our paper "Finetuned Multimodal Language Models are High-Quality Image-Text Data Filters".☆57Updated last month
- The official github repo for MixEval-X, the first any-to-any, real-world benchmark.☆14Updated 3 months ago
- ☆22Updated 9 months ago
- The SVO-Probes Dataset for Verb Understanding☆31Updated 3 years ago
- Sparkles: Unlocking Chats Across Multiple Images for Multimodal Instruction-Following Models☆44Updated 11 months ago
- [NAACL 2024] Vision language model that reduces hallucinations through self-feedback guided revision. Visualizes attentions on image feat…☆44Updated 8 months ago
- Official implementation of the paper "The Hidden Language of Diffusion Models"☆72Updated last year