Aman-4-Real / See-or-Guess
[ACM MM 2024] See or Guess: Counterfactually Regularized Image Captioning
☆16Updated 11 months ago
Alternatives and similar repositories for See-or-Guess
Users that are interested in See-or-Guess are comparing it to the libraries listed below
- [ACM MM 2022] Multi-Modal Experience Inspired AI Creation☆21Updated last year
- [ICLR 2024] Towards Unified Multi-Modal Personalization: Large Vision-Language Models for Generative Recommendation and Beyond☆21Updated last year
- a multimodal retrieval dataset☆24Updated 2 years ago
- [EMNLP 2023] InfoSeek: A New VQA Benchmark Focused on Visual Info-Seeking Questions☆25Updated last year
- Official repository for the A-OKVQA dataset☆109Updated last year
- ChatBridge, an approach to learning a unified multimodal model to interpret, correlate, and reason about various modalities without rely…☆54Updated 2 years ago
- ☆101Updated 3 years ago
- ☆24Updated last year
- [EMNLP 2024] mDPO: Conditional Preference Optimization for Multimodal Large Language Models.☆85Updated last year
- [AAAI 2023] Official implementation of FiTs: Fine-grained Two-stage Training for Knowledge Base Question Answering☆11Updated 2 years ago
- ☆24Updated 2 years ago
- ✨✨The Curse of Multi-Modalities (CMM): Evaluating Hallucinations of Large Multimodal Models across Language, Visual, and Audio☆53Updated 6 months ago
- Mind the Gap: Understanding the Modality Gap in Multi-modal Contrastive Representation Learning☆168Updated 3 years ago
- Code for the ACL 2023 paper: Exploiting! Multimodal Relation Extraction with Feature Denoising and Multimodal Topic Modeling.☆20Updated last year
- ☆27Updated 3 years ago
- ☆59Updated last year
- The Social-IQ 2.0 Challenge Release for the Artificial Social Intelligence Workshop at ICCV '23☆35Updated 2 years ago
- ☆19Updated 2 years ago
- Paper, dataset and code list for multimodal dialogue.☆22Updated last year
- [Paperlist] Awesome paper list of multimodal dialog, including methods, datasets and metrics☆37Updated last year
- [MM 2025] Towards Modality Generalization: A Benchmark and Prospective Analysis☆28Updated 8 months ago
- [ICLR 2024] Analyzing and Mitigating Object Hallucination in Large Vision-Language Models☆156Updated last year
- MultiInstruct: Improving Multi-Modal Zero-Shot Learning via Instruction Tuning☆134Updated 2 years ago
- Official code for paper "UniIR: Training and Benchmarking Universal Multimodal Information Retrievers" (ECCV 2024)☆177Updated last year
- ESPER☆24Updated last year
- ☆25Updated 6 months ago
- Code for our EMNLP-2022 paper: "Language Prior Is Not the Only Shortcut: A Benchmark for Shortcut Learning in VQA"☆41Updated 3 years ago
- [ICLR 2023] This is the code repo for our ICLR'23 paper "Universal Vision-Language Dense Retrieval: Learning A Unified Representation Spa…☆53Updated last year
- Repository for the paper: Dense and Aligned Captions (DAC) Promote Compositional Reasoning in VL Models☆27Updated 2 years ago
- The official GitHub page for "Evaluating Object Hallucination in Large Vision-Language Models"☆243Updated 5 months ago