GraphPKU / CoI
Chain of Images for Intuitively Reasoning
☆9 · Updated last year
Alternatives and similar repositories for CoI
Users who are interested in CoI are comparing it to the repositories listed below.
- Enhancing Large Vision Language Models with Self-Training on Image Comprehension. ☆68 · Updated last year
- Source code for the paper "Prefix Language Models are Unified Modal Learners" ☆43 · Updated 2 years ago
- [NAACL 2025 Oral] Multimodal Needle in a Haystack (MMNeedle): Benchmarking Long-Context Capability of Multimodal Large Language Models ☆44 · Updated last month
- Code for paper "Unraveling Cross-Modality Knowledge Conflicts in Large Vision-Language Models." ☆42 · Updated 7 months ago
- Codes for ReFocus: Visual Editing as a Chain of Thought for Structured Image Understanding ☆32 · Updated last month
- ☆27 · Updated last year
- PyTorch codes for the paper "An Empirical Study of Multimodal Model Merging" ☆37 · Updated last year
- The released data for paper "Measuring and Improving Chain-of-Thought Reasoning in Vision-Language Models". ☆32 · Updated last year
- This repository contains the code and data for the paper "VisOnlyQA: Large Vision Language Models Still Struggle with Visual Perception o… ☆23 · Updated 2 months ago
- [TMLR 2024] Official implementation of "Sight Beyond Text: Multi-Modal Training Enhances LLMs in Truthfulness and Ethics" ☆19 · Updated last year
- [ACL 2023] Delving into the Openness of CLIP ☆23 · Updated 2 years ago
- Official Code for ACL 2023 Outstanding Paper: World-to-Words: Grounded Open Vocabulary Acquisition through Fast Mapping in Vision-Languag… ☆32 · Updated last year
- Code for "Seeking Neural Nuggets: Knowledge Transfer in Large Language Models from a Parametric Perspective" ☆32 · Updated last year
- Research code for "KAT: A Knowledge Augmented Transformer for Vision-and-Language" ☆64 · Updated 2 years ago
- Counterfactual Reasoning VQA Dataset ☆25 · Updated last year
- Official code for "pi-Tuning: Transferring Multimodal Foundation Models with Optimal Multi-task Interpolation", ICML 2023. ☆33 · Updated last year
- [NeurIPS 2024] A Novel Rank-Based Metric for Evaluating Large Language Models ☆46 · Updated last week
- ☆54 · Updated last year
- Vision Large Language Models trained on M3IT instruction tuning dataset ☆17 · Updated last year
- [AAAI 2024] MELO: Enhancing Model Editing with Neuron-indexed Dynamic LoRA ☆25 · Updated last year
- Official Repository of LatentSeek ☆30 · Updated last week
- ☆19 · Updated last year
- ☆40 · Updated 6 months ago
- Evaluation framework for paper "VisualWebBench: How Far Have Multimodal LLMs Evolved in Web Page Understanding and Grounding?" ☆57 · Updated 7 months ago
- Mosaic IT: Enhancing Instruction Tuning with Data Mosaics ☆18 · Updated 3 months ago
- [ICCV2023] Official code for "VL-PET: Vision-and-Language Parameter-Efficient Tuning via Granularity Control" ☆53 · Updated last year
- [DMLR 2024] Benchmarking Robustness of Multimodal Image-Text Models under Distribution Shift ☆37 · Updated last year
- [Arxiv] Aligning Modalities in Vision Large Language Models via Preference Fine-tuning ☆86 · Updated last year
- ☆28 · Updated 2 months ago
- The official code for paper "EasyGen: Easing Multimodal Generation with a Bidirectional Conditional Diffusion Model and LLMs" ☆73 · Updated 6 months ago