zhiyuanhubj / Long_form_VideoQA
[EMNLP’24 Main] Encoding and Controlling Global Semantics for Long-form Video Question Answering
Alternatives and similar repositories for Long_form_VideoQA:
Users interested in Long_form_VideoQA are comparing it to the repositories listed below.
- [ACL’24 Findings] Video-Language Understanding: A Survey from Model Architecture, Model Training, and Data Perspectives
- Code for ProTrix: Building Models for Planning and Reasoning over Tables with Sentence Context
- [EMNLP 2024] mDPO: Conditional Preference Optimization for Multimodal Large Language Models
- Source code and data used in the papers ViQuAE (Lerner et al., SIGIR’22), Multimodal ICT (Lerner et al., ECIR’23) and Cross-modal Retriev…
- [EMNLP 2023] MQuAKE: Assessing Knowledge Editing in Language Models via Multi-Hop Questions
- Visual question answering prompting recipes for large vision-language models
- Less is More: Mitigating Multimodal Hallucination from an EOS Decision Perspective (ACL 2024)
- Code for 'Why is Winoground Hard? Investigating Failures in Visuolinguistic Compositionality' (EMNLP 2022)
- Source code for InBedder, an instruction-following text embedder
- [EMNLP 2023 Findings] InfoCL: Alleviating Catastrophic Forgetting in Continual Text Classification from An Information Theoretic Perspe…
- Research code for "KAT: A Knowledge Augmented Transformer for Vision-and-Language"
- [ICCV 2023 Oral] Open-domain Visual Entity Recognition Towards Recognizing Millions of Wikipedia Entities
- [ICML 2023] Code for the paper "Compositional Exemplars for In-context Learning"
- Holistic Coverage and Faithfulness Evaluation of Large Vision-Language Models (ACL 2024 Findings)
- Code for the paper "Unraveling Cross-Modality Knowledge Conflicts in Large Vision-Language Models"
- Official repository for the paper "High-Dimension Human Value Representation in Large Language Models"
- Code and data for the ACL 2024 paper "Cross-Modal Projection in Multimodal LLMs Doesn't Really Project Visual Attributes to Textual Space"
- Evaluating the Ripple Effects of Knowledge Editing in Language Models