ADaM-BJTU / Mind_with_eyes_Awesome_MLLMs_Reasoning
This repository is continuously updated with the latest papers, technical reports, and benchmarks on multimodal reasoning.
☆55Mar 21, 2025Updated 10 months ago
Alternatives and similar repositories for Mind_with_eyes_Awesome_MLLMs_Reasoning
Users who are interested in Mind_with_eyes_Awesome_MLLMs_Reasoning are comparing it to the repositories listed below
- Official repository for CoMM Dataset☆49Dec 31, 2024Updated last year
- ☆12Jul 16, 2025Updated 7 months ago
- ☆13Feb 24, 2025Updated 11 months ago
- Does Understanding Inform Generation in Unified Multimodal Models? From Analysis to Path Forward☆60Nov 27, 2025Updated 2 months ago
- ☆19May 19, 2024Updated last year
- Accepted LLM Papers in NeurIPS 2024☆37Oct 13, 2024Updated last year
- ☆11Updated this week
- [CVPR 2025] Interleaved-Modal Chain-of-Thought☆106Dec 30, 2025Updated last month
- MME-CoT: Benchmarking Chain-of-Thought in LMMs for Reasoning Quality, Robustness, and Efficiency☆136Aug 5, 2025Updated 6 months ago
- This repository is the official implementation of "Look-Back: Implicit Visual Re-focusing in MLLM Reasoning".☆84Jul 10, 2025Updated 7 months ago
- Train a DeepSeek R1-like reasoning LLM with ease☆18Feb 15, 2025Updated last year
- [ICCV 2025 Highlight] The official repository for "2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining"☆191Mar 17, 2025Updated 11 months ago
- MCP DeepResearch Server: a deep research server built on LangGraph + Ollama + Tavily, with support for asynchronous execution, timeout control, and progress push notifications☆31Jun 16, 2025Updated 8 months ago
- ☆19Dec 6, 2023Updated 2 years ago
- Rough LLM Interpreter of ComfyUI☆28Jan 23, 2025Updated last year
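- This project uses the PaddlePaddle platform to build an innovative multi-model collaborative system for efficient, accurate conversion of PDF files to Markdown files.☆27Mar 25, 2025Updated 10 months ago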
- Interleaving Reasoning: Next-Generation Reasoning Systems for AGI☆251Oct 17, 2025Updated 4 months ago
- Python SDK for DashScope [Model Studio](https://www.alibabacloud.com/en/product/modelstudio?_p_lc=1)☆43Feb 9, 2026Updated last week
- Collections of Papers and Projects for Multimodal Reasoning.☆107Apr 25, 2025Updated 9 months ago
- AutoCoA (Automatic generation of Chain-of-Action) is an agent model framework that enhances the multi-turn tool usage capability of reaso…☆130Mar 18, 2025Updated 10 months ago
- Accelerating Vision-Language Pretraining with Free Language Modeling (CVPR 2023)☆32May 15, 2023Updated 2 years ago
- A library of visualization tools for the interpretability and hallucination analysis of large vision-language models (LVLMs).☆41May 22, 2025Updated 8 months ago
- A simple WeChat Official Account layout tool based on Dify☆16Jun 27, 2025Updated 7 months ago
- A Complete Introduction to Building Generative AI Apps with Dify☆17May 25, 2025Updated 8 months ago
- Official codebase for the paper "Reasoning Within the Mind: Dynamic Multimodal Interleaving in Latent Space"☆59Dec 17, 2025Updated 2 months ago
- Code for ReFocus: Visual Editing as a Chain of Thought for Structured Image Understanding [ICML 2025]☆45Jul 22, 2025Updated 6 months ago
- R1-like Computer-use Agent☆89Mar 21, 2025Updated 10 months ago
- A full-stack AI-powered business intelligence tool for non-experts, featuring serverless backend processing and a secure Streamlit fronte…☆25Jan 6, 2026Updated last month
- ☆28Dec 4, 2025Updated 2 months ago
- 100 Production-Ready Claude Code Skills - The most comprehensive collection of AI skills for sales, business automation, content creation…☆35Oct 22, 2025Updated 3 months ago
- ☆11Aug 29, 2025Updated 5 months ago
- Write database metadata into the Dify knowledge base☆12Dec 30, 2025Updated last month
- Our survey's paper list on Agentic AI, continuously updated with the latest research.☆88Oct 28, 2025Updated 3 months ago
- [NeurIPS 2025] More Thinking, Less Seeing? Assessing Amplified Hallucination in Multimodal Reasoning Models☆75May 31, 2025Updated 8 months ago
- OpenRFT: Adapting Reasoning Foundation Model for Domain-specific Tasks with Reinforcement Fine-Tuning☆155Dec 24, 2024Updated last year
- 🔥Awesome Multimodal Large Language Models Paper List☆154Mar 12, 2025Updated 11 months ago
- Implementation of "VL-Mamba: Exploring State Space Models for Multimodal Learning"☆86Mar 21, 2024Updated last year
- Building a multi-agent RAG system with advanced RAG methods☆12Jan 12, 2025Updated last year
- An SSH plugin for Dify☆12Jan 16, 2026Updated last month