Aman-4-Real / See-or-Guess
[ACM MM 2024] See or Guess: Counterfactually Regularized Image Captioning
☆14Updated 4 months ago
Alternatives and similar repositories for See-or-Guess
Users that are interested in See-or-Guess are comparing it to the libraries listed below
- [ACM MM 2022]: Multi-Modal Experience Inspired AI Creation☆20Updated 7 months ago
- Think Silently, Think Fast: Dynamic Latent Compression of LLM Reasoning Chains☆22Updated last week
- EMNLP2023 - InfoSeek: A New VQA Benchmark focus on Visual Info-Seeking Questions☆23Updated last year
- a multimodal retrieval dataset☆24Updated last year
- [2024-ACL]: TextBind: Multi-turn Interleaved Multimodal Instruction-following in the Wild☆47Updated last year
- [ICLR 2024] Towards Unified Multi-Modal Personalization: Large Vision-Language Models for Generative Recommendation and Beyond☆20Updated last year
- ☆27Updated 3 years ago
- Paper, dataset and code list for multimodal dialogue.☆21Updated 5 months ago
- ☆22Updated 10 months ago
- Codes for ACL 2023 paper: Exploiting! Multimodal Relation Extraction with Feature Denoising and Multimodal Topic Modeling.☆17Updated last year
- [EMNLP 2024] mDPO: Conditional Preference Optimization for Multimodal Large Language Models.☆75Updated 7 months ago
- ☆91Updated 2 years ago
- ACL'2025: SoftCoT: Soft Chain-of-Thought for Efficient Reasoning with LLMs. and preprint: SoftCoT++: Test-Time Scaling with Soft Chain-of…☆28Updated 3 weeks ago
- ☆51Updated last year
- ChatBridge, an approach to learning a unified multimodal model to interpret, correlate, and reason about various modalities without rely…☆51Updated last year
- ICCV 2023 (Oral) Open-domain Visual Entity Recognition Towards Recognizing Millions of Wikipedia Entities☆40Updated 3 weeks ago
- ☆9Updated 4 years ago
- ☆25Updated 2 years ago
- [Paperlist] Awesome paper list of multimodal dialog, including methods, datasets and metrics☆39Updated 5 months ago
- CHAIR metric is a rule-based metric for evaluating object hallucination in caption generation.