Awesome Reasoning in MLLMs: Papers and Projects about learning to reason with MLLMs, including Chain-of-Thought (CoT), OpenAI o1, and DeepSeek-R1
☆62Mar 18, 2025Updated 11 months ago
Alternatives and similar repositories for Awesome-Reasoning-MLLM
Users that are interested in Awesome-Reasoning-MLLM are comparing it to the libraries listed below
- Collections of Papers and Projects for Multimodal Reasoning.☆107Apr 25, 2025Updated 10 months ago
- CoT-Valve: Length-Compressible Chain-of-Thought Tuning☆92Feb 14, 2025Updated last year
- Latest Advances on Reasoning of Multimodal Large Language Models (Multimodal R1 \ Visual R1) 🍓☆36Apr 3, 2025Updated 11 months ago
- SophiaVL-R1: Reinforcing MLLMs Reasoning with Thinking Reward☆92Aug 8, 2025Updated 7 months ago
- A Holistic Embodied Cognition Benchmark☆18Apr 3, 2025Updated 11 months ago
- ☆40Dec 16, 2025Updated 2 months ago
- Latest Advances on Long Chain-of-Thought Reasoning☆615Jul 18, 2025Updated 7 months ago
- [LLaVA-Video-R1]✨First Adaptation of R1 to LLaVA-Video (2025-03-18)☆68May 9, 2025Updated 10 months ago
- ☆64Feb 4, 2026Updated last month
- [ACL 2025 Main] SceneGenAgent: Precise Industrial Scene Generation with Coding Agent☆35Nov 29, 2024Updated last year
- [ECCV'22 Poster] Explicit Image Caption Editing☆22Nov 30, 2022Updated 3 years ago
- A collection of multimodal reasoning papers, codes, datasets, benchmarks and resources.☆572Feb 22, 2026Updated 2 weeks ago
- You ship an iOS app, we ship an Apple developer license.☆26Oct 3, 2025Updated 5 months ago
- Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey☆958Nov 14, 2025Updated 3 months ago
- A comprehensive React Native starter template built with Expo. It includes reusable UI components, Poppins font setup, NativeWind, Fireba…☆23Updated this week
- A simple lightweight Model Context Protocol (MCP) server integration framework☆17Jan 23, 2026Updated last month
- [ICCV 2025] Dynamic-VLM☆28Dec 16, 2024Updated last year
- Due to the huge vocabulary size (151,936) of Qwen models, the Embedding and LM Head weights are excessively heavy. Therefore, this projec…☆34Jan 6, 2026Updated 2 months ago
- This project provides a tutorial on Tensor Parallel (TP) deployment of huggingface LLM models on the 910B, and can also serve as minimal code for learning TP.☆32Jan 6, 2026Updated 2 months ago
- This repo contains code and data for ICLR 2025 paper MIA-Bench: Towards Better Instruction Following Evaluation of Multimodal LLMs☆37Mar 9, 2025Updated last year
- Structured TRIZ prompt engineering for LLMs in an open, portable XML format – MIT licensed.☆16Nov 11, 2025Updated 3 months ago
- [ACL 2025] The official PyTorch implementation of "MIND: A Multi-agent Framework for Zero-shot Harmful Meme Detection".☆25May 26, 2025Updated 9 months ago
- Official code for "pi-Tuning: Transferring Multimodal Foundation Models with Optimal Multi-task Interpolation", ICML 2023.☆33Jul 21, 2023Updated 2 years ago
- AuraMatrix is a personality analysis web app which uses an LLM to do evaluation. I made this for Gyanotsav-2025 to show different ways to ut…☆11Dec 22, 2025Updated 2 months ago
- [CVPR' 25] Interleaved-Modal Chain-of-Thought☆106Dec 30, 2025Updated 2 months ago
- Experimental paper writing linter.☆35Sep 2, 2024Updated last year
- Video-R1: Reinforcing Video Reasoning in MLLMs [🔥the first paper to explore R1 for video]☆835Dec 14, 2025Updated 2 months ago
- GPT-4V(ision) as A Social Media Analysis Engine☆38Dec 20, 2024Updated last year
- My old 2017-2018 menu template, for iOS. Hopefully some of you find it useful.☆10Feb 15, 2022Updated 4 years ago
- Latest Advances on (RL based) Multimodal Reasoning and Generation in Multimodal Large Language Models☆47Oct 30, 2025Updated 4 months ago
- Import iOS 15 shortcuts on 13/14☆13Apr 5, 2022Updated 3 years ago
- MAGELLAN: Metacognitive predictions of learning progress guide autotelic LLM agents in large goal spaces☆10Mar 24, 2025Updated 11 months ago
- WeChat official account crawler☆12Apr 13, 2024Updated last year
- VibEx (vx) is a developer-friendly CLI tool that streamlines the process of working with AI coding assistants. It helps developers prepar…☆29May 17, 2025Updated 9 months ago
- Glitch Gremlin AI☆15Apr 5, 2025Updated 11 months ago
- Set screen resolution on all iOS versions.☆14Sep 2, 2025Updated 6 months ago
- CoachLint is your AI coding coach. It guides you through errors instead of just solving them for you.☆23Nov 20, 2025Updated 3 months ago
- Evaluating Vision & Language Pretraining Models with Objects, Attributes and Relations. [EMNLP 2022]☆137Sep 29, 2024Updated last year
- PyTorch code for "Contrastive Region Guidance: Improving Grounding in Vision-Language Models without Training"☆39Mar 4, 2024Updated 2 years ago