chancharikmitra / CCoT
[CVPR 2024] Official Code for the Paper "Compositional Chain-of-Thought Prompting for Large Multimodal Models"
☆145 · Jun 20, 2024 · Updated last year
Alternatives and similar repositories for CCoT
Users interested in CCoT are comparing it to the repositories listed below
- The official repo for “TextCoT: Zoom In for Enhanced Multimodal Text-Rich Image Understanding”. ☆44 · Sep 24, 2024 · Updated last year
- [NeurIPS 2023] DDCoT: Duty-Distinct Chain-of-Thought Prompting for Multimodal Reasoning in Language Models ☆49 · Mar 18, 2024 · Updated last year
- [ACM Multimedia 2025] This is the official repo for Debiasing Large Visual Language Models, including a Post-Hoc debias method and Visual… ☆82 · Feb 22, 2025 · Updated 11 months ago
- [ACL 2025 Findings] Official PyTorch implementation of "Don't Miss the Forest for the Trees: Attentional Vision Calibration for Large Vis… ☆24 · Jul 21, 2024 · Updated last year
- [NeurIPS'24 Spotlight] Visual CoT: Advancing Multi-Modal Language Models with a Comprehensive Dataset and Benchmark for Chain-of-Thought … ☆424 · Dec 22, 2024 · Updated last year
- ☆20 · Jan 3, 2025 · Updated last year
- [EMNLP 2024] mDPO: Conditional Preference Optimization for Multimodal Large Language Models. ☆85 · Nov 10, 2024 · Updated last year
- [CVPR 2025] Mitigating Object Hallucinations in Large Vision-Language Models with Assembly of Global and Local Attention ☆61 · Jul 16, 2024 · Updated last year
- [CVPR 2025] CheXWorld: Exploring Image World Modeling for Radiograph Representation Learning ☆37 · Apr 21, 2025 · Updated 9 months ago
- [ICLR 2024] Analyzing and Mitigating Object Hallucination in Large Vision-Language Models ☆155 · Apr 30, 2024 · Updated last year
- Official Codes for Fine-Grained Visual Prompting, NeurIPS 2023 ☆56 · Feb 1, 2024 · Updated 2 years ago
- [NeurIPS 2024] Calibrated Self-Rewarding Vision Language Models ☆85 · Oct 26, 2025 · Updated 3 months ago
- Code and data for ACL 2024 paper on "Cross-Modal Projection in Multimodal LLMs Doesn't Really Project Visual Attributes to Textual Space" ☆19 · Jul 21, 2024 · Updated last year
- [ACL 2024] Multi-modal preference alignment remedies regression of visual instruction tuning on language model ☆47 · Nov 10, 2024 · Updated last year
- Less is More: Mitigating Multimodal Hallucination from an EOS Decision Perspective (ACL 2024) ☆57 · Oct 28, 2024 · Updated last year
- [CVPR 2024] The official implementation of paper "Synthesize, Diagnose, and Optimize: Towards Fine-Grained Vision-Language Understanding" ☆50 · Jun 16, 2025 · Updated 7 months ago
- The codebase for our EMNLP24 paper: Multimodal Self-Instruct: Synthetic Abstract Image and Visual Reasoning Instruction Using Language Mo… ☆86 · Jan 27, 2025 · Updated last year
- Using image captions with LLM for zero-shot VQA ☆18 · Mar 14, 2024 · Updated last year
- DetToolChain: A New Prompting Paradigm to Unleash Detection Ability of MLLM ☆45 · Oct 12, 2024 · Updated last year
- The repository of the ACCV 2024 paper "FG-CXR: A Radiologist-Aligned Gaze Dataset for Enhancing Interpretability in Chest X-Ray Report Ge… ☆11 · Jul 28, 2025 · Updated 6 months ago
- [CVPRW 2024] LaPA: Latent Prompt Assist Model for Medical Visual Question Answering ☆24 · Apr 24, 2025 · Updated 9 months ago
- [COLM'25] Official implementation of the Law of Vision Representation in MLLMs ☆176 · Oct 6, 2025 · Updated 4 months ago
- AAPL: Adding Attributes to Prompt Learning for Vision-Language Models (CVPRW 2024) ☆34 · May 8, 2024 · Updated last year
- Look, Compare, Decide: Alleviating Hallucination in Large Vision-Language Models via Multi-View Multi-Path Reasoning ☆24 · Sep 9, 2024 · Updated last year
- [ICML 2024] Repo for the paper "Evaluating and Analyzing Relationship Hallucinations in Large Vision-Language Models" ☆22 · Jan 1, 2025 · Updated last year
- [NeurIPS 2024] Dense Connector for MLLMs ☆180 · Oct 14, 2024 · Updated last year
- PyTorch code for "Contrastive Region Guidance: Improving Grounding in Vision-Language Models without Training" ☆39 · Mar 4, 2024 · Updated last year
- [NeurIPS'24] Official implementation of paper "Unveiling the Tapestry of Consistency in Large Vision-Language Models". ☆38 · Oct 23, 2024 · Updated last year
- KAIST medical VL research group ☆20 · Dec 20, 2024 · Updated last year
- Visual question answering prompting recipes for large vision-language models ☆28 · Sep 14, 2024 · Updated last year
- ☆21 · Jul 25, 2022 · Updated 3 years ago
- [NeurIPS 2025] Think or Not? Selective Reasoning via Reinforcement Learning for Vision-Language Models ☆53 · Sep 29, 2025 · Updated 4 months ago
- Scaffold Prompting to promote LMMs ☆46 · Dec 16, 2024 · Updated last year
- ☆15 · Sep 23, 2024 · Updated last year
- [CVPR 2023] Q: How to Specialize Large Vision-Language Models to Data-Scarce VQA Tasks? A: Self-Train on Unlabeled Images! ☆17 · May 14, 2024 · Updated last year
- [CVPR 2024] A benchmark for evaluating Multimodal LLMs using multiple-choice questions. ☆360 · Jan 14, 2025 · Updated last year
- ☆360 · Jan 27, 2024 · Updated 2 years ago
- [ECCV 2024] API: Attention Prompting on Image for Large Vision-Language Models ☆111 · Oct 10, 2024 · Updated last year
- [ECCV 2024] Mind the Interference: Retaining Pre-trained Knowledge in Parameter-Efficient Continual Learning of Vision-Language Models ☆56 · Jul 9, 2024 · Updated last year