Doodling our way to AGI ✏️ 🖼️ 🧠
★126 · May 29, 2025 · Updated last year
Alternatives and similar repositories for thinking-with-generated-images
Users that are interested in thinking-with-generated-images are comparing it to the libraries listed below.
- [ICCV 2025] A Benchmark for Multi-Step Reasoning in Long Narrative Videos ★28 · Jun 4, 2026 · Updated 2 weeks ago
- This is the official repository for the paper "MathCanvas: Intrinsic Visual Chain-of-Thought for Multimodal Mathematical Reasoning" ★77 · Apr 14, 2026 · Updated 2 months ago
- Resources and paper list for "Thinking with Images for LVLMs". This repository accompanies our survey on how LVLMs can leverage visual in… ★1,485 · Mar 9, 2026 · Updated 3 months ago
- Math-VR Benchmark & CodePlot-CoT: Mathematical Visual Reasoning by Thinking with Code-Driven Images ★64 · Nov 4, 2025 · Updated 7 months ago
- The official implementation of COOPER: A Unified Model for Cooperative Perception and Reasoning in Spatial Intelligence. ★38 · Dec 30, 2025 · Updated 5 months ago
- https://huggingface.co/datasets/multimodal-reasoning-lab/Zebra-CoT ★136 · Jan 30, 2026 · Updated 4 months ago
- ★20 · Jan 26, 2025 · Updated last year
- Rethinking RL Scaling for Vision Language Models: A Transparent, From-Scratch Framework and Comprehensive Evaluation Scheme ★149 · Apr 9, 2025 · Updated last year
- [NeurIPS 2025 DB] OneIG-Bench is a meticulously designed comprehensive benchmark framework for fine-grained evaluation of T2I models acro… ★120 · Feb 10, 2026 · Updated 4 months ago
- ★1,237 · Nov 20, 2025 · Updated 6 months ago
- VideoEval-Pro: Robust and Realistic Long Video Understanding Evaluation [TMLR26] ★17 · Jun 1, 2026 · Updated 2 weeks ago
- ★51 · Oct 29, 2023 · Updated 2 years ago
- ★48 · May 16, 2026 · Updated last month
- Official PyTorch implementation of RACRO (https://www.arxiv.org/abs/2506.04559) ★19 · Jul 1, 2025 · Updated 11 months ago
- Official code repository of Shuffle-R1 ★26 · Feb 23, 2026 · Updated 3 months ago
- DeepSeek-V3.2-Exp DSA Warmup Lightning Indexer training operator based on tilelang ★44 · Nov 19, 2025 · Updated 7 months ago
- [ICLR 2026] The official repository for paper "ThinkMorph: Emergent Properties in Multimodal Interleaved Chain-of-Thought Reasoning" ★187 · May 1, 2026 · Updated last month
- KnowRL: Exploring Knowledgeable Reinforcement Learning for Factuality ★47 · May 19, 2026 · Updated last month
- ★129 · Jul 22, 2025 · Updated 10 months ago
- [ICLR 2026] Official Implementation of Muddit [Meissonic II]: Liberating Generation Beyond Text-to-Image with a Unified Discrete Diffusio… ★116 · Apr 13, 2026 · Updated 2 months ago
- [Extended version ICLR 2025 Blog Track] Anole: An Open, Autoregressive and Native Multimodal Models for Interleaved Image-Text Generatio… ★841 · Jun 16, 2025 · Updated last year
- E-GRPO: High Entropy Steps Drive Effective Reinforcement Learning for Flow Models ★44 · Jan 5, 2026 · Updated 5 months ago
- A hand-made OS core for National College Students Computer System Ability Competition ★11 · Aug 28, 2021 · Updated 4 years ago
- GenEval: An object-focused framework for evaluating text-to-image alignment ★459 · Mar 3, 2025 · Updated last year
- SFT+RL boosts multimodal reasoning ★50 · Jun 27, 2025 · Updated 11 months ago
- SophiaVL-R1: Reinforcing MLLMs Reasoning with Thinking Reward ★95 · Aug 8, 2025 · Updated 10 months ago
- [ICLR 2026] ParallelBench: Understanding the Tradeoffs of Parallel Decoding in Diffusion LLMs ★47 · Mar 27, 2026 · Updated 2 months ago
- [NeurIPS 2025 D&B Track] MLR-Bench: Evaluating AI Agents on Open-Ended Machine Learning Research ★30 · May 8, 2026 · Updated last month
- [ICLR26] GoT-R1: Unleashing Reasoning Capability of MLLM for Visual Generation with Reinforcement Learning ★106 · Jan 27, 2026 · Updated 4 months ago
- The official repository for SkyLadder: Better and Faster Pretraining via Context Window Scheduling ★43 · Dec 29, 2025 · Updated 5 months ago
- [NIPS 25'] Evaluation code of paper "KRIS-Bench: Benchmarking Next-Level Intelligent Image Editing Models" ★45 · Oct 19, 2025 · Updated 8 months ago
- [CVPR 2026] UnicEdit-10M and UnicBench project ★41 · Mar 3, 2026 · Updated 3 months ago
- ★23 · Apr 15, 2025 · Updated last year
- [CVPR 2026] Official implementation of FantasyVLN: Unified Multimodal Chain-of-Thought Reasoning for Vision-and-Language Navigation ★32 · Feb 23, 2026 · Updated 3 months ago
- Selftok: Discrete Visual Tokens of Autoregression, by Diffusion, and for Reasoning ★238 · May 30, 2025 · Updated last year
- [CVPR 2025] 🔥 Official impl. of "TokenFlow: Unified Image Tokenizer for Multimodal Understanding and Generation". ★465 · Aug 8, 2025 · Updated 10 months ago
- ★23 · Jun 16, 2025 · Updated last year
- Security-native LLM system for AI-generated application security. ★263 · Jun 4, 2026 · Updated 2 weeks ago
- [ACL 2026 Findings] "Omni-R1: Towards the Unified Generative Paradigm for Multimodal Reasoning" ★62 · May 26, 2026 · Updated 3 weeks ago