tuhinjubcse / VisualMetaphors
Code and Data for ACL 2023 paper I Spy a Metaphor: Large Language Models and Diffusion Models Co-Create Visual Metaphors
☆11Updated last year
Alternatives and similar repositories for VisualMetaphors:
Users that are interested in VisualMetaphors are comparing it to the libraries listed below
- Corpus to accompany: "Do Androids Laugh at Electric Sheep? Humor "Understanding" Benchmarks from The New Yorker Caption Contest"☆54Updated 2 weeks ago
- How well can Text-to-Image Generative Models understand Ethical Natural Language Interventions?☆13Updated last year
- Preference Learning for LLaVA☆41Updated 4 months ago
- VPEval Codebase from Visual Programming for Text-to-Image Generation and Evaluation (NeurIPS 2023)☆44Updated last year
- ☆30Updated last year
- Code, data, models for the Sherlock corpus☆57Updated 2 years ago
- ☆33Updated last year
- Code and data for ACL 2024 paper on 'Cross-Modal Projection in Multimodal LLMs Doesn't Really Project Visual Attributes to Textual Space'☆13Updated 8 months ago
- [ACL2023, Findings] Source codes for the paper "Werewolf Among Us: Multimodal Resources for Modeling Persuasion Behaviors in Social Deduc…☆12Updated last month
- [EMNLP'23 Oral] ReSee: Responding through Seeing Fine-grained Visual Knowledge in Open-domain Dialogue PyTorch Implementation☆12Updated last year
- [2024-ACL]: TextBind: Multi-turn Interleaved Multimodal Instruction-following in the Wildrounded Conversation☆47Updated last year
- Codes for paper "Stylized Story Generation with Style-Guided Planning"☆13Updated 3 years ago
- [ACL 2023] Code and data for our paper "Measuring Progress in Fine-grained Vision-and-Language Understanding"☆13Updated last year
- Official implementation and dataset for the NAACL 2024 paper "ComCLIP: Training-Free Compositional Image and Text Matching"☆35Updated 7 months ago
- ☆20Updated 8 months ago
- This repository contains the data and code of the paper titled "IllusionVQA: A Challenging Optical Illusion Dataset for Vision Language M…☆13Updated 5 months ago
- ☆35Updated last year
- Code and dataset for NAACL 2022 paper "CoSIm: Commonsense Reasoning for Counterfactual Scene Imagination" Hyounghun Kim, Abhay Zala, Mohi…☆15Updated 2 years ago
- The KiloGram Tangrams dataset☆55Updated last year
- The Conceptual Coverage Across Languages Benchmark for Text-to-Image Models☆12Updated 5 months ago
- Code for 'Why is Winoground Hard? Investigating Failures in Visuolinguistic Compositionality', EMNLP 2022☆30Updated last year
- Official repo for "Imagination-Augmented Natural Language Understanding", NAACL 2022.☆17Updated 2 years ago
- DeepPerception: Advancing R1-like Cognitive Visual Perception in MLLMs for Knowledge-Intensive Visual Grounding☆38Updated this week
- ☆15Updated 7 months ago
- [NAACL 2024] Vision language model that reduces hallucinations through self-feedback guided revision. Visualizes attentions on image feat…☆44Updated 7 months ago
- ☆48Updated last year
- Source code for the paper "Prefix Language Models are Unified Modal Learners"☆43Updated last year
- Code and data for "Broaden the Vision: Geo-Diverse Visual Commonsense Reasoning" (EMNLP 2021).☆28Updated 3 years ago
- ICCV 2023 (Oral) Open-domain Visual Entity Recognition Towards Recognizing Millions of Wikipedia Entities☆38Updated 6 months ago
- Sparkles: Unlocking Chats Across Multiple Images for Multimodal Instruction-Following Models☆43Updated 9 months ago