PhoebusSi / Thinking-while-Observing

Code for our ACL-2023 paper: "Combo of Thinking and Observing for Outside-Knowledge VQA"

☆12

Alternatives and similar repositories for Thinking-while-Observing

Users who are interested in Thinking-while-Observing often compare it to the repositories listed below.

PhoebusSi / VQA-VS
Code for our EMNLP-2022 paper: "Language Prior Is Not the Only Shortcut: A Benchmark for Shortcut Learning in VQA"
☆40 Updated 2 years ago
PaulLerner / ViQuAE
Source code and data used in the papers ViQuAE (Lerner et al., SIGIR'22), Multimodal ICT (Lerner et al., ECIR'23) and Cross-modal Retriev…
☆38 Updated 10 months ago
phellonchen / awesome-visual-dialog
Recent Advances in Visual Dialog
☆30 Updated 3 years ago
kugwzk / DiDE
Code for EMNLP 2022 paper “Distilled Dual-Encoder Model for Vision-Language Understanding”
☆31 Updated 2 years ago
AndersonStra / MuKEA
MuKEA: Multimodal Knowledge Extraction and Accumulation for Knowledge-based Visual Question Answering
☆98 Updated 2 years ago
limanling / clip-event
☆106 Updated 3 years ago
OpenMatch / UniVL-DR
[ICLR 2023] This is the code repo for our ICLR'23 paper "Universal Vision-Language Dense Retrieval: Learning A Unified Representation Spa…
☆53 Updated last year
WeiminXiong / RationaleCL
Rationale-enhanced language models are better continual relation learners (EMNLP 2023 Main Conference)
☆12 Updated 2 years ago
open-vision-language / oven
☆40 Updated 2 years ago
MichaelZhouwang / VLUE
This repo contains code and instructions for baselines in the VLUE benchmark.
☆41 Updated 3 years ago
edchengg / infoseek_eval
EMNLP 2023 - InfoSeek: A New VQA Benchmark Focused on Visual Info-Seeking Questions
☆25 Updated last year
allenai / multimodalqa
☆142 Updated 3 years ago
hackerchenzhuo / LaKo
[Paper][IJCKG 2022] LaKo: Knowledge-driven Visual Question Answering via Late Knowledge-to-Text Injection
☆26 Updated last year
JetRunner / SuperICL
Code for "Small Models are Valuable Plug-ins for Large Language Models"
☆131 Updated 2 years ago
Victorwz / VaLM
VaLM: Visually-augmented Language Modeling. ICLR 2023.
☆56 Updated 2 years ago
OhadRubin / EPR
☆64 Updated 2 years ago
liujch1998 / vera
☆16 Updated 2 years ago
xiami2019 / CLAIF
[Findings of ACL'2023] Improving Contrastive Learning of Sentence Embeddings from AI Feedback
☆40 Updated 2 years ago
WebQnA / WebQA
☆57 Updated 9 months ago
TobiasLee / VEC
Visual and Embodied Concepts evaluation benchmark
☆21 Updated 2 years ago
PhoebusSi / SAR
Code for our ACL2021 paper: "Check It Again: Progressive Visual Question Answering via Visual Entailment"
☆31 Updated 3 years ago
Wusiwei0410 / SciMMIR
☆24 Updated last year
ZUCC-AI / UMIE
Code and model for AAAI 2024: UMIE: Unified Multimodal Information Extraction with Instruction Tuning
☆41 Updated last year
Zhiquan-Wen / D-VQA
PyTorch implementation of "Debiased Visual Question Answering from Feature and Sample Perspectives" (NeurIPS 2021)
☆25 Updated 3 years ago
morningmoni / UniPELT
Code for paper "UniPELT: A Unified Framework for Parameter-Efficient Language Model Tuning", ACL 2022
☆63 Updated 3 years ago
BenfengXu / KNNPrompting
Released code for our ICLR23 paper.
☆66 Updated 2 years ago
guilk / KAT
Research code for "KAT: A Knowledge Augmented Transformer for Vision-and-Language"
☆68 Updated 3 years ago
ychen-stat-ml / kernel-adapters
Code for "Inducer-tuning: Connecting Prefix-tuning and Adapter-tuning" (EMNLP 2022) and "Empowering Parameter-Efficient Transfer Learning…
☆11 Updated 2 years ago
X-PLUG / mPLUG-HalOwl
mPLUG-HalOwl: Multimodal Hallucination Evaluation and Mitigation
☆98 Updated last year
dqxiu / KAssess
☆14 Updated 2 years ago