Doodling our way to AGI
★123 | May 29, 2025 | Updated 10 months ago
Alternatives and similar repositories for thinking-with-generated-images
Users that are interested in thinking-with-generated-images are comparing it to the libraries listed below.
- [ICCV 2025] A Benchmark for Multi-Step Reasoning in Long Narrative Videos | ★25 | Aug 8, 2025 | Updated 8 months ago
- This is the official repository for the paper "MathCanvas: Intrinsic Visual Chain-of-Thought for Multimodal Mathematical Reasoning" | ★67 | Dec 29, 2025 | Updated 3 months ago
- Resources and paper list for "Thinking with Images for LVLMs". This repository accompanies our survey on how LVLMs can leverage visual in… | ★1,419 | Mar 9, 2026 | Updated last month
- Math-VR Benchmark & CodePlot-CoT: Mathematical Visual Reasoning by Thinking with Code-Driven Images | ★56 | Nov 4, 2025 | Updated 5 months ago
- Video-Holmes: Can MLLM Think Like Holmes for Complex Video Reasoning? | ★91 | Jul 13, 2025 | Updated 9 months ago
- https://huggingface.co/datasets/multimodal-reasoning-lab/Zebra-CoT | ★132 | Jan 30, 2026 | Updated 2 months ago
- The official implementation of COOPER: A Unified Model for Cooperative Perception and Reasoning in Spatial Intelligence. | ★36 | Dec 30, 2025 | Updated 3 months ago
- ★19 | Jan 26, 2025 | Updated last year
- Rethinking RL Scaling for Vision Language Models: A Transparent, From-Scratch Framework and Comprehensive Evaluation Scheme | ★147 | Apr 9, 2025 | Updated last year
- [NeurIPS 2025 DB] OneIG-Bench is a meticulously designed comprehensive benchmark framework for fine-grained evaluation of T2I models acro… | ★116 | Feb 10, 2026 | Updated 2 months ago
- [CVPR 2026] Thinking with Programming Vision: Towards a Unified View for Thinking with Images | ★68 | Jan 23, 2026 | Updated 2 months ago
- ★1,189 | Nov 20, 2025 | Updated 4 months ago
- More reliable Video Understanding Evaluation | ★15 | Sep 23, 2025 | Updated 6 months ago
- ★51 | Oct 29, 2023 | Updated 2 years ago
- ★46 | Mar 24, 2026 | Updated 3 weeks ago
- [ICLR 2026] ParallelBench: Understanding the Tradeoffs of Parallel Decoding in Diffusion LLMs | ★45 | Mar 27, 2026 | Updated 3 weeks ago
- Official PyTorch implementation of RACRO (https://www.arxiv.org/abs/2506.04559) | ★19 | Jul 1, 2025 | Updated 9 months ago
- Official code repository of Shuffle-R1 | ★25 | Feb 23, 2026 | Updated last month
- DeepSeek-V3.2-Exp DSA Warmup Lightning Indexer training operator based on tilelang | ★44 | Nov 19, 2025 | Updated 5 months ago
- [ICLR 2026] The official repository for paper "ThinkMorph: Emergent Properties in Multimodal Interleaved Chain-of-Thought Reasoning" | ★178 | Jan 26, 2026 | Updated 2 months ago
- ★123 | Jul 22, 2025 | Updated 8 months ago
- [ICLR 2026] Official Implementation of Muddit [Meissonic II]: Liberating Generation Beyond Text-to-Image with a Unified Discrete Diffusio… | ★112 | Updated this week
- EARL: Editing with Autoregression and RL | ★42 | Nov 21, 2025 | Updated 4 months ago
- E-GRPO: High Entropy Steps Drive Effective Reinforcement Learning for Flow Models | ★41 | Jan 5, 2026 | Updated 3 months ago
- Anole: An Open, Autoregressive and Native Multimodal Models for Interleaved Image-Text Generation | ★835 | Jun 16, 2025 | Updated 10 months ago
- GenEval: An object-focused framework for evaluating text-to-image alignment | ★445 | Mar 3, 2025 | Updated last year
- ★30 | Jul 2, 2024 | Updated last year
- SFT+RL boosts multimodal reasoning | ★48 | Jun 27, 2025 | Updated 9 months ago
- SophiaVL-R1: Reinforcing MLLMs Reasoning with Thinking Reward | ★93 | Aug 8, 2025 | Updated 8 months ago
- [ICLR26] GoT-R1: Unleashing Reasoning Capability of MLLM for Visual Generation with Reinforcement Learning | ★105 | Jan 27, 2026 | Updated 2 months ago
- [NeurIPS 2025 D&B Track] MLR-Bench: Evaluating AI Agents on Open-Ended Machine Learning Research | ★26 | Sep 23, 2025 | Updated 6 months ago
- The official repository for SkyLadder: Better and Faster Pretraining via Context Window Scheduling | ★42 | Dec 29, 2025 | Updated 3 months ago
- [CVPR 2026] UnicEdit-10M and UnicBench project | ★40 | Mar 3, 2026 | Updated last month
- ★22 | Apr 15, 2025 | Updated last year
- [ICCV 23] A Simple Vision Transformer for Weakly Semi-supervised 3D Object Detection | ★13 | Apr 12, 2024 | Updated 2 years ago
- Selftok: Discrete Visual Tokens of Autoregression, by Diffusion, and for Reasoning | ★238 | May 30, 2025 | Updated 10 months ago
- ★16 | Nov 18, 2023 | Updated 2 years ago
- "Omni-R1: Towards the Unified Generative Paradigm for Multimodal Reasoning" | ★62 | Jan 28, 2026 | Updated 2 months ago
- Code for "Visual Spatial Description: Controlled Spatial-Oriented Image-to-Text Generation" | ★25 | Mar 9, 2024 | Updated 2 years ago