xinyan-cxy/MINT-CoT

[NeurIPS 2025] MINT-CoT: Enabling Interleaved Visual Tokens in Mathematical Chain-of-Thought Reasoning

☆107

Alternatives and similar repositories for MINT-CoT

Users that are interested in MINT-CoT are comparing it to the libraries listed below.

jungao1106 / ICoT
[CVPR'25] Interleaved-Modal Chain-of-Thought
☆112 · Dec 30, 2025 · Updated 7 months ago
ZiyuGuo99 / Thinking-while-Generating
The first interleaved framework for textual reasoning within the visual generation process
☆165 · Mar 16, 2026 · Updated 4 months ago
xinyan-cxy / OpenCoF
OpenCoF: Learning to Reason Through Video Generation
☆75 · Jul 10, 2026 · Updated 3 weeks ago
UMass-Embodied-AGI / Mirage
[CVPR 2026] Machine Mental Imagery: Empower Multimodal Reasoning with Latent Visual Tokens
☆294 · Aug 2, 2025 · Updated 11 months ago
Accio-Lab / SwimBird
☆18 · Apr 9, 2026 · Updated 3 months ago
Mini-o3 / Mini-o3
Official Code for "Mini-o3: Scaling Up Reasoning Patterns and Interaction Turns for Visual Search"
☆423 · Jan 29, 2026 · Updated 6 months ago
MME-Benchmarks / MME-CoT
MME-CoT: Benchmarking Chain-of-Thought in LMMs for Reasoning Quality, Robustness, and Efficiency
☆136 · Aug 5, 2025 · Updated 11 months ago
zhaochen0110 / Awesome_Think_With_Images
Resources and paper list for "Thinking with Images for LVLMs". This repository accompanies our survey on how LVLMs can leverage visual in…
☆1,498 · Mar 9, 2026 · Updated 4 months ago
hwanyu112 / Latent-Sketchpad
☆73 · Feb 1, 2026 · Updated 5 months ago
Visual-Agent / DeepEyes
☆1,255 · Nov 20, 2025 · Updated 8 months ago
ZiyuGuo99 / MME-CoF
Are Video Models Ready as Zero-shot Reasoners?
☆87 · Nov 24, 2025 · Updated 8 months ago
TIGER-AI-Lab / Pixel-Reasoner
Pixel-Level Reasoning Model trained with RL [NeurIPS25]
☆301 · Jun 4, 2026 · Updated last month
cxfann / Flame
☆15 · May 19, 2026 · Updated 2 months ago
VincentLeebang / lvr
Official codebase for the paper Latent Visual Reasoning
☆172 · Oct 22, 2025 · Updated 9 months ago
NOVAglow646 / Monet
[CVPR 2026] Official codes of "Monet: Reasoning in Latent Visual Space Beyond Image and Language"
☆216 · Mar 19, 2026 · Updated 4 months ago
ThinkMorph / ThinkMorph
[ICLR 2026] The official repository for paper "ThinkMorph: Emergent Properties in Multimodal Interleaved Chain-of-Thought Reasoning"
☆192 · May 1, 2026 · Updated 2 months ago
cythu / PeBR-R1
☆15 · Apr 20, 2026 · Updated 3 months ago
ZrrSkywalker / MAVIS
[ICLR 2025] Mathematical Visual Instruction Tuning for Multi-modal Large Language Models
☆156 · Dec 5, 2024 · Updated last year
ZiyuGuo99 / ATLAS
One Discrete Word for Visual Reasoning Overtakes Agentic and Latent Methods
☆137 · Jun 9, 2026 · Updated last month
Ivan-Tang-3D / ENEL
[ICLR 2026] The official implementation of the paper "Exploring the Potential of Encoder-free Architectures in 3D LMMs"
☆11 · Jan 26, 2026 · Updated 6 months ago
ZrrSkywalker / MathVerse
[ECCV 2024] Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems?
☆183 · Apr 28, 2025 · Updated last year
CaraJ7 / DraCo
Official Repository for Paper: DraCo: Draft as CoT for Text-to-Image Preview and Rare Concept Generation
☆18 · Dec 7, 2025 · Updated 7 months ago
shilinyan99 / CrossLMM
CrossLMM: Decoupling Long Video Sequences from LMMs via Dual Cross-Attention Mechanisms
☆25 · Dec 21, 2025 · Updated 7 months ago
li-qi-lin / USTC-Course-Materials-Sharing
This repository shares undergraduate course materials for the Electronic Information Engineering program at the University of Science and…
☆68 · Mar 10, 2026 · Updated 4 months ago
w-yibo / VTC-R1
VTC-R1: Vision-Text Compression for Efficient Long-Context Reasoning.
☆26 · Jul 20, 2026 · Updated last week
alchemistyzz / PeRL
[NeurIPS'25] The official code of "PeRL: Permutation-Enhanced Reinforcement Learning for Interleaved Vision-Language Reasoning"
☆30 · Mar 30, 2026 · Updated 4 months ago
open-compass / ProSA
[EMNLP 2024 Findings] ProSA: Assessing and Understanding the Prompt Sensitivity of LLMs
☆29 · May 22, 2025 · Updated last year
zsgvivo / VideoZoomer
☆34 · Feb 12, 2026 · Updated 5 months ago
marinero4972 / Open-o3-Video
[ICML 2026] Official implementation of "Open-o3 Video: Grounded Video Reasoning with Explicit Spatio-Temporal Evidence"
☆158 · May 1, 2026 · Updated 2 months ago
Lucky-Lance / SPP
[ICML 2024] SPP: Sparsity-Preserved Parameter-Efficient Fine-Tuning for Large Language Models
☆22 · May 28, 2024 · Updated 2 years ago
uni-medical / GMAI-VL-R1
☆19 · Jul 21, 2025 · Updated last year
Haochen-Wang409 / TreeVGR
[ICLR'26] Traceable Evidence Enhanced Visual Grounded Reasoning: Evaluation and Methodology
☆91 · Jan 26, 2026 · Updated 6 months ago
yaotingwangofficial / Awesome-MCoT
Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey
☆1,016 · May 22, 2026 · Updated 2 months ago
PKU-YuanGroup / Look-Back
This repository is the official implementation of "Look-Back: Implicit Visual Re-focusing in MLLM Reasoning".
☆100 · Jul 10, 2025 · Updated last year
Gabesarch / grounded-rl
☆133 · Jul 22, 2025 · Updated last year
agents-x-project / PyVision
[MTI-LLM@NeurIPS 2025] Official implementation of "PyVision: Agentic Vision with Dynamic Tooling."
☆163 · Jul 22, 2025 · Updated last year
jun297 / v1
v1: Learning to Point Visual Tokens for Multimodal Grounded Reasoning
☆21 · Jul 20, 2026 · Updated last week
TIGER-AI-Lab / VL-Rethinker
The official code of "VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning" [NeurIPS25]
☆190 · Jun 5, 2025 · Updated last year
Sun-Haoyuan23 / Awesome-RL-based-Reasoning-MLLMs
This repository provides a valuable reference for researchers in the field of multimodality; please start your exploratory travel in RL-bas…
☆1,438 · May 11, 2026 · Updated 2 months ago