Awesome Reasoning in MLLMs: Papers and Projects about learning to reason with MLLMs, including Chain-of-Thought (CoT), OpenAI o1, and DeepSeek-R1
☆63Mar 18, 2025Updated last year
Alternatives and similar repositories for Awesome-Reasoning-MLLM
Users that are interested in Awesome-Reasoning-MLLM are comparing it to the libraries listed below.
- Awesome LLM papers, news and projects about learning to reason with LLM, OpenAI o1, reasoning techniques, chain-of-thought (CoT), Large …☆28Oct 10, 2024Updated last year
- Collections of Papers and Projects for Multimodal Reasoning.☆108Apr 25, 2025Updated last year
- SophiaVL-R1: Reinforcing MLLMs Reasoning with Thinking Reward☆94Aug 8, 2025Updated 8 months ago
- Code for the paper "Controllable Video Captioning with an Exemplar Sentence"☆12Apr 14, 2021Updated 5 years ago
- Latest Advances on Reasoning of Multimodal Large Language Models (Multimodal R1 \ Visual R1) 🍓☆36Apr 3, 2025Updated last year
- Code accompanying our EMNLP 2019 paper: "Revisiting the Evaluation of Theory of Mind through Question Answering"☆27Aug 9, 2020Updated 5 years ago
- A Holistic Embodied Cognition Benchmark☆19Apr 3, 2025Updated last year
- A collection of multimodal reasoning papers, codes, datasets, benchmarks and resources.☆590Apr 1, 2026Updated last month
- Latest Advances on Long Chain-of-Thought Reasoning☆630Jul 18, 2025Updated 9 months ago
- ☆68Feb 4, 2026Updated 3 months ago
- Arabic Grapheme-to-Phoneme (G2P) Conversion☆13Mar 15, 2025Updated last year
- [ECCV 2022] GEB+: A Benchmark for Generic Event Boundary Captioning, Grounding and Retrieval☆17Aug 24, 2022Updated 3 years ago
- [NeurIPS 2025] Reasoning MLLM, Share-GRPO, advantage vanishing, sparse reward☆36Sep 19, 2025Updated 7 months ago
- [KDD 2026] Voxlect: A Speech Foundation Model Benchmark for Modeling Dialects and Regional Languages Around the Globe☆32Aug 10, 2025Updated 8 months ago
- ☆45Dec 16, 2025Updated 4 months ago
- Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey☆989Nov 14, 2025Updated 5 months ago
- The original code for the data providers and the datasets of the paper "Defining Benchmarks for Continual Few-Shot Learning".☆16Apr 15, 2020Updated 6 years ago
- The KlicStudio MCP server is a connector based on the Model Context Protocol (MCP), designed to facilitate interactions with KlicStudio s…☆21Jul 30, 2025Updated 9 months ago
- [ACL 2025 Main] SceneGenAgent: Precise Industrial Scene Generation with Coding Agent☆36Nov 29, 2024Updated last year
- Official repository of "TDSD: Text-Driven Scene-Decoupled Weakly Supervised Video Anomaly Detection"☆11May 25, 2025Updated 11 months ago
- ☆110Sep 11, 2025Updated 7 months ago
- [ECCV'22 Poster] Explicit Image Caption Editing☆22Nov 30, 2022Updated 3 years ago
- OpenMediation SDK Server☆15Oct 4, 2022Updated 3 years ago
- The official repo for [ACM CSUR'24] "Empowering Agrifood System with Artificial Intelligence: A Survey of the Progress, Challenges and Op…☆12Dec 6, 2024Updated last year
- [Findings of EMNLP 2022] AssistSR: Task-oriented Video Segment Retrieval for Personal AI Assistant☆23Sep 11, 2023Updated 2 years ago
- This project provides a Tensor Parallel (TP) deployment tutorial for huggingface LLM models on the 910B, and can also serve as a minimal codebase for learning TP.☆32Jan 6, 2026Updated 4 months ago
- My implementation of the vehicle anomaly detection from https://github.com/ShuaiBai623/AI-City-Anomaly-Detection☆10Aug 30, 2019Updated 6 years ago
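- Reference answers for the Wireshark labs (computer networking); for reference only.☆10Apr 15, 2018Updated 8 years ago
- This series aims to let readers understand and implement, from scratch, every component of a large language model, along with the training and fine-tuning code, using only basic PyTorch and no other off-the-shelf external libraries. Readers need only Python, PyTorch, and the most basic deep learning background.☆387Aug 28, 2025Updated 8 months ago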
- ICS_2020_PJ☆11Dec 25, 2020Updated 5 years ago
- auto star for repo lists☆10Aug 26, 2023Updated 2 years ago
- utilities to deal with videos ...☆15Jul 27, 2020Updated 5 years ago
- This repo contains code and data for ICLR 2025 paper MIA-Bench: Towards Better Instruction Following Evaluation of Multimodal LLMs☆38Mar 9, 2025Updated last year
- Due to the huge vocabulary size (151,936) of Qwen models, the Embedding and LM Head weights are excessively heavy. Therefore, this projec…☆38Jan 6, 2026Updated 4 months ago
- Multi-Modal Tree of thoughts for DALLE-3 like auto self improvement☆17Nov 11, 2024Updated last year
- Advanced Machine Learning Fall 2020 Project Repository☆12Dec 12, 2020Updated 5 years ago
- Official code for "pi-Tuning: Transferring Multimodal Foundation Models with Optimal Multi-task Interpolation", ICML 2023.☆33Jul 21, 2023Updated 2 years ago
- A chatbot webui that supports various multi-modal large language models☆11May 8, 2023Updated 2 years ago
- Build Your Own Bundle-A Neural Combinatorial Optimization Method (BYOB)☆13Apr 27, 2022Updated 4 years ago