eujhwang / meme-cap
☆39 · Updated last year
Alternatives and similar repositories for meme-cap:
Users interested in meme-cap are comparing it to the repositories listed below
- Hate-CLIPper: Multimodal Hateful Meme Classification with Explicit Cross-modal Interaction of CLIP features - Accepted at EMNLP 2022 Work… ☆49 · Updated 2 years ago
- Corpus to accompany: "Do Androids Laugh at Electric Sheep? Humor "Understanding" Benchmarks from The New Yorker Caption Contest" ☆54 · Updated last week
- MultiInstruct: Improving Multi-Modal Zero-Shot Learning via Instruction Tuning ☆136 · Updated last year
- Code, data, models for the Sherlock corpus ☆57 · Updated 2 years ago
- In-the-wild Question Answering ☆15 · Updated last year
- Repository for Multilingual-VQA task created during HuggingFace JAX/Flax community week. ☆34 · Updated 3 years ago
- Implementation of our ACL2023 paper: Unifying Cross-Lingual and Cross-Modal Modeling Towards Weakly Supervised Multilingual Vision-Langua… ☆19 · Updated last year
- ☆12 · Updated 2 years ago
- [NeurIPS 2023] A faithful benchmark for vision-language compositionality ☆77 · Updated last year
- Resources for cultural NLP research ☆86 · Updated 2 months ago
- An Image/Text Retrieval Test Collection to Support Multimedia Content Creation ☆20 · Updated last year
- Reading list for Multimodal Large Language Models ☆68 · Updated last year
- Code and data for ImageCoDe, a contextual vision-and-language benchmark ☆39 · Updated last year
- NAACL 2022: MCSE: Multimodal Contrastive Learning of Sentence Embeddings ☆55 · Updated 9 months ago
- ☆16 · Updated 2 years ago
- [ICML 2022] Code and data for our paper "IGLUE: A Benchmark for Transfer Learning across Modalities, Tasks, and Languages" ☆49 · Updated 2 years ago
- [ACL 2024] FLEUR: An Explainable Reference-Free Evaluation Metric for Image Captioning Using a Large Multimodal Model ☆14 · Updated 7 months ago
- This is the official implementation of the paper "MM-SHAP: A Performance-agnostic Metric for Measuring Multimodal Contributions in Vision… ☆26 · Updated last year
- Official code for paper "UniIR: Training and Benchmarking Universal Multimodal Information Retrievers" (ECCV 2024) ☆137 · Updated 5 months ago
- ☆26 · Updated 3 years ago
- NLX-GPT: A Model for Natural Language Explanations in Vision and Vision-Language Tasks, CVPR 2022 (Oral) ☆48 · Updated last year
- ☆19 · Updated this week
- SlideVQA: A Dataset for Document Visual Question Answering on Multiple Images (AAAI2023) ☆87 · Updated last year
- ☆83 · Updated last year
- [ACL2023, Findings] Source codes for the paper "Werewolf Among Us: Multimodal Resources for Modeling Persuasion Behaviors in Social Deduc… ☆12 · Updated last month
- [NeurIPS 2023] Self-Chained Image-Language Model for Video Localization and Question Answering ☆187 · Updated last year
- A curated list of research papers and resources on Cultural LLM. ☆41 · Updated 6 months ago
- Data and code for the paper "Inducing Positive Perspectives with Text Reframing" ☆57 · Updated last year
- PyTorch code for "Unifying Vision-and-Language Tasks via Text Generation" (ICML 2021) ☆368 · Updated last year
- ☆88 · Updated last year