hwanyu112/Latent-Sketchpad

☆73

Alternatives and similar repositories for Latent-Sketchpad

Users interested in Latent-Sketchpad are comparing it to the repositories listed below.

UMass-Embodied-AGI / Mirage
[CVPR 2026] Machine Mental Imagery: Empower Multimodal Reasoning with Latent Visual Tokens
☆294 · Aug 2, 2025 · Updated 11 months ago
VincentLeebang / lvr
Official codebase for the paper "Latent Visual Reasoning"
☆172 · Oct 22, 2025 · Updated 9 months ago
hwanyu112 / VIBE-Benchmark
☆27 · Feb 3, 2026 · Updated 5 months ago
NOVAglow646 / Monet
[CVPR 2026] Official code for "Monet: Reasoning in Latent Visual Space Beyond Image and Language"
☆216 · Mar 19, 2026 · Updated 4 months ago
ThinkMorph / ThinkMorph
[ICLR 2026] The official repository for the paper "ThinkMorph: Emergent Properties in Multimodal Interleaved Chain-of-Thought Reasoning"
☆192 · May 1, 2026 · Updated 3 months ago
multimodal-reasoning-lab / Bagel-Zebra-CoT
https://huggingface.co/datasets/multimodal-reasoning-lab/Zebra-CoT
☆137 · Jan 30, 2026 · Updated 6 months ago
chengzu-li / MVoT
Imagine While Reasoning in Space: Multimodal Visualization-of-Thought (ICML 2025)
☆78 · Apr 12, 2025 · Updated last year
LaVi-Lab / Rethink_CoT_Video
Official code for "Rethinking Chain-of-Thought Reasoning for Videos"
☆21 · Dec 14, 2025 · Updated 7 months ago
XD111ds / ILVR
[ACL'26 Oral] Interleaved Latent Visual Reasoning with Selective Perceptual Modeling
☆66 · May 29, 2026 · Updated 2 months ago
arctanxarc / GENIUS
☆43 · May 9, 2026 · Updated 2 months ago
yfzhang114 / SliME
✨✨ Beyond LLaVA-HD: Diving into High-Resolution Large Multimodal Models
☆163 · Dec 26, 2024 · Updated last year
arijitray1993 / SAT
Spatial Aptitude Training for Multimodal Language Models
☆33 · Feb 8, 2026 · Updated 5 months ago
Wakals / CoVT
[ECCV 2026] Official repo of "Chain-of-Visual-Thought: Teaching VLMs to See and Think Better with Continuous Visual Tokens"
☆386 · Updated this week
ybb6 / laser
☆35 · Apr 22, 2026 · Updated 3 months ago
xinyan-cxy / MINT-CoT
[NeurIPS 2025] MINT-CoT: Enabling Interleaved Visual Tokens in Mathematical Chain-of-Thought Reasoning
☆107 · Sep 19, 2025 · Updated 10 months ago
ByteDance-BandAI / CodeVision
[CVPR 2026] Thinking with Programming Vision: Towards a Unified View for Thinking with Images
☆71 · Jan 23, 2026 · Updated 6 months ago
ShareLab-SII / CaTok
[CVPR 2026] Official repository of "CaTok: Taming Mean Flows for One-Dimensional Causal Image Tokenization"
☆19 · Mar 9, 2026 · Updated 4 months ago
heliossun / LaCoT
[NeurIPS 2025] Official code for the paper "Latent Chain-of-Thought for Visual Reasoning"
☆36 · Oct 16, 2025 · Updated 9 months ago
TungChintao / SkiLa
Official code for "Sketch-in-Latents: Eliciting Unified Reasoning in MLLMs"
☆17 · Feb 15, 2026 · Updated 5 months ago
yfzhang114 / Thyme
✨✨ [ICLR 2026] Think Beyond Images
☆584 · Sep 23, 2025 · Updated 10 months ago
IVUL-KAUST / VideoAuto-R1
[CVPR 2026] VideoAuto-R1: Video Auto Reasoning via Thinking Once, Answering Twice
☆88 · Feb 27, 2026 · Updated 5 months ago
AntResearchNLP / ViLaSR
[NeurIPS 2025] Reinforcing Spatial Reasoning in Vision-Language Models with Interwoven Thinking and Visual Drawing
☆97 · Jul 27, 2025 · Updated last year
CYWang735 / AdaTooler-V
☆72 · Feb 27, 2026 · Updated 5 months ago
ChoS3nE11ven / Agentic-MME
☆36 · Apr 13, 2026 · Updated 3 months ago
ModalityDance / Omni-R1
[ACL 2026 Findings] "Omni-R1: Towards the Unified Generative Paradigm for Multimodal Reasoning"
☆63 · May 26, 2026 · Updated 2 months ago
FrankYang-17 / RealUnify
☆28 · Oct 10, 2025 · Updated 9 months ago
om-ai-lab / ZoomEye
[EMNLP 2025 Oral] ZoomEye: Enhancing Multimodal LLMs with Human-Like Zooming Capabilities through Tree-Based Image Exploration
☆91 · Nov 20, 2025 · Updated 8 months ago
ulab-uiuc / GraphEval
[ICLR 2025] "GraphEval: A Lightweight Graph-Based LLM Framework for Idea Evaluation", Tao Feng, Yihang Sun, Jiaxuan You
☆17 · Mar 18, 2025 · Updated last year
UCSB-AI / DMLR
[CVPR 2026] Official codebase for the paper "Reasoning Within the Mind: Dynamic Multimodal Interleaving in Latent Space"
☆85 · May 12, 2026 · Updated 2 months ago
UniPat-AI / BabyVision
We introduce BabyVision, a benchmark revealing the infancy of AI vision.
☆235 · Jan 13, 2026 · Updated 6 months ago
thuml / Reasoning-Visual-World
Official repository for "Visual Generation Unlocks Human-Like Reasoning through Multimodal World Models", https://arxiv.org/abs/2601.1983…
☆100 · Mar 9, 2026 · Updated 4 months ago
inclusionAI / Zooming-without-Zooming
[ICML 2026] ZwZ model family: SOTA fine-grained perception performance; ZoomBench: a new challenging perception benchmark
☆183 · May 4, 2026 · Updated 2 months ago
OpenCausaLab / CauSight
CauSight: Learning to Supersense for Visual Causal Discovery
☆21 · Mar 11, 2026 · Updated 4 months ago
shiwk24 / MathCanvas
Official repository for the paper "MathCanvas: Intrinsic Visual Chain-of-Thought for Multimodal Mathematical Reasoning"
☆80 · Apr 14, 2026 · Updated 3 months ago
ZiyuGuo99 / ATLAS
One Discrete Word for Visual Reasoning Overtakes Agentic and Latent Methods
☆137 · Jun 9, 2026 · Updated last month
FYYDCC / IVT-LR
Official repository for "Reasoning in the Dark: Interleaved Vision-Text Reasoning in Latent Space"
☆18 · Jan 27, 2026 · Updated 6 months ago
zhaochen0110 / Awesome_Think_With_Images
Resources and paper list for "Thinking with Images for LVLMs". This repository accompanies our survey on how LVLMs can leverage visual in…
☆1,498 · Mar 9, 2026 · Updated 4 months ago
MajorDavidZhang / Generalization_unified_VLM
☆24 · May 23, 2025 · Updated last year
ZiyuGuo99 / Thinking-while-Generating
The first interleaved framework for textual reasoning within the visual generation process
☆165 · Mar 16, 2026 · Updated 4 months ago