Aman-4-Real / See-or-Guess
[ACM MM 2024] See or Guess: Counterfactually Regularized Image Captioning
☆14 · Updated 5 months ago
Alternatives and similar repositories for See-or-Guess
Users who are interested in See-or-Guess compare it to the repositories listed below.
- [ACM MM 2022]: Multi-Modal Experience Inspired AI Creation ☆20 · Updated 7 months ago
- EMNLP2023 - InfoSeek: A New VQA Benchmark focus on Visual Info-Seeking Questions ☆24 · Updated last year
- [ICLR 2024] Towards Unified Multi-Modal Personalization: Large Vision-Language Models for Generative Recommendation and Beyond ☆20 · Updated last year
- Paper, dataset and code list for multimodal dialogue. ☆21 · Updated 6 months ago
- Official code for paper "UniIR: Training and Benchmarking Universal Multimodal Information Retrievers" (ECCV 2024) ☆155 · Updated 9 months ago
- a multimodal retrieval dataset ☆24 · Updated 2 years ago
- ☆14 · Updated 8 months ago
- ☆22 · Updated 11 months ago
- [AAAI 2023] Official implementation of FiTs: Fine-grained Two-stage Training for Knowledge Base Question Answering ☆11 · Updated 2 years ago
- [ECCV'24] Official Implementation of Autoregressive Visual Entity Recognizer. ☆14 · Updated last year
- Codes for ACL 2023 paper: Exploiting! Multimodal Relation Extraction with Feature Denoising and Multimodal Topic Modeling. ☆17 · Updated last year
- ☆17 · Updated last year
- ☆27 · Updated 3 years ago
- ☆54 · Updated last year
- [Paperlist] Awesome paper list of multimodal dialog, including methods, datasets and metrics ☆39 · Updated 5 months ago
- Official repository for the A-OKVQA dataset ☆96 · Updated last year
- [EMNLP 2024] mDPO: Conditional Preference Optimization for Multimodal Large Language Models. ☆77 · Updated 8 months ago
- [2024-ACL]: TextBind: Multi-turn Interleaved Multimodal Instruction-following in the Wild ☆47 · Updated last year
- The source code of ExFunTube ☆10 · Updated last year
- ☆53 · Updated last year
- ☆78 · Updated last year
- ☆92 · Updated 2 years ago
- ☆39 · Updated last year
- [IJCAI'23] The official Github page of the paper "Diffusion Models for Non-autoregressive Text Generation: A Survey". ☆56 · Updated last year
- ☆55 · Updated 6 months ago
- mPLUG: Effective and Efficient Vision-Language Learning by Cross-modal Skip-connections. (EMNLP 2022) ☆94 · Updated 2 years ago
- [EMNLP'2023 Findings] MoqaGPT, for zero-shot multimodal question answering with LLMs ☆12 · Updated 6 months ago
- ☆16 · Updated 2 years ago
- Official Code Repository for the paper "Knowledge-Augmented Reasoning Distillation for Small Language Models in Knowledge-intensive Tasks… ☆39 · Updated 7 months ago
- A paper list about diffusion models for natural language processing. ☆182 · Updated last year