Doodling our way to AGI ✏️ 🖼️ 🧠
☆121 · May 29, 2025 · Updated 9 months ago
Alternatives and similar repositories for thinking-with-generated-images
Users who are interested in thinking-with-generated-images are comparing it to the repositories listed below
- The official implementation of COOPER: A Unified Model for Cooperative Perception and Reasoning in Spatial Intelligence. ☆28 · Dec 30, 2025 · Updated 2 months ago
- [ICCV 2025] A Benchmark for Multi-Step Reasoning in Long Narrative Videos ☆24 · Aug 8, 2025 · Updated 6 months ago
- More reliable Video Understanding Evaluation ☆14 · Sep 23, 2025 · Updated 5 months ago
- Math-VR Benchmark & CodePlot-CoT: Mathematical Visual Reasoning by Thinking with Code-Driven Images ☆53 · Nov 4, 2025 · Updated 4 months ago
- Official PyTorch implementation of RACRO (https://www.arxiv.org/abs/2506.04559) ☆19 · Jul 1, 2025 · Updated 8 months ago
- Video-Holmes: Can MLLM Think Like Holmes for Complex Video Reasoning? ☆88 · Jul 13, 2025 · Updated 7 months ago
- [CVPR 2026] Thinking with Programming Vision: Towards a Unified View for Thinking with Images ☆56 · Jan 23, 2026 · Updated last month
- E-GRPO: High Entropy Steps Drive Effective Reinforcement Learning for Flow Models ☆39 · Jan 5, 2026 · Updated 2 months ago
- Resources and paper list for "Thinking with Images for LVLMs". This repository accompanies our survey on how LVLMs can leverage visual in… ☆1,346 · Feb 3, 2026 · Updated last month
- https://huggingface.co/datasets/multimodal-reasoning-lab/Zebra-CoT ☆125 · Jan 30, 2026 · Updated last month
- [NeurIPS 2025 DB] OneIG-Bench is a meticulously designed comprehensive benchmark framework for fine-grained evaluation of T2I models acro… ☆108 · Feb 10, 2026 · Updated 3 weeks ago
- [ACM MM25] Official PyTorch implementation of [Decoupled Global-Local Alignment for Improving Compositional Understanding] ☆15 · Jul 15, 2025 · Updated 7 months ago
- UnicEdit-10M and UnicBench project ☆23 · Updated this week
- ChineseCLIP using online learning ☆13 · Nov 7, 2022 · Updated 3 years ago
- MLR-Bench: Evaluating AI Agents on Open-Ended Machine Learning Research ☆22 · Sep 23, 2025 · Updated 5 months ago
- SFT+RL boosts multimodal reasoning ☆46 · Jun 27, 2025 · Updated 8 months ago
- ☆13 · Jan 22, 2025 · Updated last year
- [AAAI 2026] ReCode: Reinforced Code Knowledge Editing for API Updates ☆22 · Jul 1, 2025 · Updated 8 months ago
- [ICLR 26] The official code repository for the paper "Mirage or Method? How Model-Task Alignment Induces Divergent RL Conclusions". ☆15 · Feb 9, 2026 · Updated 3 weeks ago
- [ICLR 2026] BARREL: Boundary-Aware Reasoning for Factual and Reliable LRMs ☆17 · May 21, 2025 · Updated 9 months ago
- ☆34 · Jan 25, 2026 · Updated last month
- ☆118 · Jul 22, 2025 · Updated 7 months ago
- LoPA: Scaling dLLM Inference via Lookahead Parallel Decoding ☆35 · Jan 16, 2026 · Updated last month
- MARSHAL: Incentivizing Multi-Agent Reasoning via Self-Play with Strategic LLMs ☆39 · Feb 19, 2026 · Updated 2 weeks ago
- ☆50 · Oct 29, 2023 · Updated 2 years ago
- This is the official repository for the paper "MathCanvas: Intrinsic Visual Chain-of-Thought for Multimodal Mathematical Reasoning" ☆63 · Dec 29, 2025 · Updated 2 months ago
- B-STAR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners ☆86 · May 21, 2025 · Updated 9 months ago
- ☆1,137 · Nov 20, 2025 · Updated 3 months ago
- The official repository for SkyLadder: Better and Faster Pretraining via Context Window Scheduling ☆42 · Dec 29, 2025 · Updated 2 months ago
- Code for ReFocus: Visual Editing as a Chain of Thought for Structured Image Understanding [ICML 2025] ☆45 · Jul 22, 2025 · Updated 7 months ago
- PICABench: How Far Are We from Physically Realistic Image Editing? ☆36 · Nov 5, 2025 · Updated 4 months ago
- Rethinking RL Scaling for Vision Language Models: A Transparent, From-Scratch Framework and Comprehensive Evaluation Scheme ☆147 · Apr 9, 2025 · Updated 10 months ago
- ☆20 · Jun 16, 2025 · Updated 8 months ago
- SophiaVL-R1: Reinforcing MLLMs Reasoning with Thinking Reward ☆92 · Aug 8, 2025 · Updated 6 months ago
- Official code repository of Shuffle-R1 ☆25 · Feb 23, 2026 · Updated last week
- OneEdit: A Neural-Symbolic Collaboratively Knowledge Editing System. ☆19 · Oct 14, 2024 · Updated last year
- ☆43 · Jul 9, 2025 · Updated 7 months ago
- ☆19 · Mar 25, 2025 · Updated 11 months ago
- OmniGAIA: Towards Native Omni-Modal AI Agents ☆46 · Updated this week