Wusiwei0410 / SciMMIR
☆22 · Updated last month
Related projects:
- Official PyTorch Implementation of MLLM Is a Strong Reranker: Advancing Multimodal Retrieval-augmented Generation via Knowledge-enhanced … ☆22 · Updated 2 weeks ago
- ☆31 · Updated 3 months ago
- ☆11 · Updated 2 months ago
- MoCLE (First MLLM with MoE for instruction customization and generalization!) (https://arxiv.org/abs/2312.12379) ☆28 · Updated 5 months ago
- Code for Math-LLaVA: Bootstrapping Mathematical Reasoning for Multimodal Large Language Models ☆52 · Updated 2 months ago
- Code and data for "Instruct Once, Chat Consistently in Multiple Rounds: An Efficient Tuning Framework for Dialogue" (ACL 2024) ☆20 · Updated last month
- Official code for the paper "UniIR: Training and Benchmarking Universal Multimodal Information Retrievers" (ECCV 2024) ☆92 · Updated 2 months ago
- MLLM-Bench: Evaluating Multimodal LLMs with Per-sample Criteria ☆49 · Updated last month
- [ACL 2024] This is the code repo for our ACL'24 paper "MARVEL: Unlocking the Multi-Modal Capability of Dense Retrieval via Visual Module … ☆22 · Updated 2 months ago
- [ICML 2024] Can AI Assistants Know What They Don't Know? ☆62 · Updated 7 months ago
- An easy-to-use hallucination detection framework for LLMs. ☆48 · Updated 4 months ago
- Enhancing Large Vision Language Models with Self-Training on Image Comprehension. ☆51 · Updated 3 months ago
- ChartAssistant is a chart-based vision-language model for universal chart comprehension and reasoning. ☆101 · Updated last week
- Visual and Embodied Concepts evaluation benchmark ☆21 · Updated 11 months ago
- ☆13 · Updated 10 months ago
- 🦩 Visual Instruction Tuning with Polite Flamingo - training multi-modal LLMs to be both clever and polite! (AAAI-24 Oral) ☆63 · Updated 9 months ago
- An automatic MLLM hallucination detection framework ☆17 · Updated 11 months ago
- Data for evaluating GPT-4V ☆11 · Updated 10 months ago
- ☆73 · Updated 8 months ago
- The released data for the paper "Measuring and Improving Chain-of-Thought Reasoning in Vision-Language Models". ☆28 · Updated last year
- A benchmark for evaluating the capabilities of large vision-language models (LVLMs) ☆32 · Updated 10 months ago
- ☆13 · Updated 10 months ago
- ☆44 · Updated 2 weeks ago
- Codebase for the ACL 2023 paper "Mixture-of-Domain-Adapters: Decoupling and Injecting Domain Knowledge to Pre-trained Language Models' Memori… ☆44 · Updated 11 months ago
- Evaluation code and datasets for the ACL 2024 paper, VISTA: Visualized Text Embedding for Universal Multi-Modal Retrieval. The original c… ☆10 · Updated 2 weeks ago
- ☆49 · Updated last year
- [NeurIPS 2023] Make Your Pre-trained Model Reversible: From Parameter to Memory Efficient Fine-Tuning ☆28 · Updated last year
- Code for the Findings of EMNLP 2023 paper "Coarse-to-Fine Dual Encoders are Better Frame Identification Learners" ☆12 · Updated 11 months ago
- VideoNIAH: A Flexible Synthetic Method for Benchmarking Video MLLMs ☆21 · Updated 3 months ago
- [NAACL 2024] MMC: Advancing Multimodal Chart Understanding with LLM Instruction Tuning ☆75 · Updated last month