NOVAglow646/Monet

[CVPR 2026] Official codes of "Monet: Reasoning in Latent Visual Space Beyond Image and Language"

☆215

Alternatives and similar repositories for Monet

Users who are interested in Monet are comparing it to the repositories listed below.

VincentLeebang / lvr
Official codebase for the paper Latent Visual Reasoning
☆171 · Oct 22, 2025 · Updated 9 months ago
UMass-Embodied-AGI / Mirage
[CVPR 2026] Machine Mental Imagery: Empower Multimodal Reasoning with Latent Visual Tokens
☆294 · Aug 2, 2025 · Updated 11 months ago
UCSB-AI / DMLR
[CVPR2026] Official codebase for the paper "Reasoning Within the Mind: Dynamic Multimodal Interleaving in Latent Space"
☆85 · May 12, 2026 · Updated 2 months ago
Wakals / CoVT
[ECCV 2026] Official repo of "Chain-of-Visual-Thought: Teaching VLMs to See and Think Better with Continuous Visual Tokens"
☆379 · Apr 17, 2026 · Updated 3 months ago
Svardfox / LaViT
Official codebase for the paper LaViT
☆34 · Feb 15, 2026 · Updated 5 months ago
TungChintao / SkiLa
Official codes of "Sketch-in-Latents: Eliciting Unified Reasoning in MLLMs"
☆17 · Feb 15, 2026 · Updated 5 months ago
heliossun / LaCoT
[NeurIPS 2025] Official code for paper: Latent Chain-of-Thought for Visual Reasoning
☆36 · Oct 16, 2025 · Updated 9 months ago
ybb6 / laser
☆35 · Apr 22, 2026 · Updated 3 months ago
XD111ds / ILVR
[ACL'26 Oral] Interleaved Latent Visual Reasoning with Selective Perceptual Modeling
☆66 · May 29, 2026 · Updated 2 months ago
FYYDCC / IVT-LR
Official repository for “Reasoning in the Dark: Interleaved Vision-Text Reasoning in Latent Space”
☆18 · Jan 27, 2026 · Updated 6 months ago
hwanyu112 / Latent-Sketchpad
☆73 · Feb 1, 2026 · Updated 5 months ago
ModalityDance / Omni-R1
[ACL 2026 Findings] "Omni-R1: Towards the Unified Generative Paradigm for Multimodal Reasoning"
☆63 · May 26, 2026 · Updated 2 months ago
xlyu0106 / VisMem
☆91 · Feb 5, 2026 · Updated 5 months ago
ZiyuGuo99 / ATLAS
One Discrete Word for Visual Reasoning Overtakes Agentic and Latent Methods
☆137 · Jun 9, 2026 · Updated last month
xlyu0106 / Awesome-Latent-Space
A paper list of Awesome Latent Space.
☆950 · Jul 13, 2026 · Updated 2 weeks ago
zhengdian1 / AIA
☆45 · Jan 4, 2026 · Updated 6 months ago
TencentBAC / RoT
[ACL 2026] Render-of-Thought: Rendering Textual Chain-of-Thought as Images for Visual Latent Reasoning
☆93 · Jan 22, 2026 · Updated 6 months ago
FanmengWang / ReGuLaR
The official implementation of “ReGuLaR: Variational Latent Reasoning Guided by Rendered Chain-of-Thought”
☆53 · Feb 2, 2026 · Updated 5 months ago
EIT-NLP / Awesome-Latent-CoT
This repository contains a regularly updated paper list for LLMs-reasoning-in-latent-space.
☆366 · Jun 20, 2026 · Updated last month
Visual-Agent / DeepEyes
☆1,251 · Nov 20, 2025 · Updated 8 months ago
ThinkMorph / ThinkMorph
[ICLR 2026] The official repository for paper "ThinkMorph: Emergent Properties in Multimodal Interleaved Chain-of-Thought Reasoning"
☆192 · May 1, 2026 · Updated 2 months ago
multimodal-reasoning-lab / Bagel-Zebra-CoT
https://huggingface.co/datasets/multimodal-reasoning-lab/Zebra-CoT
☆137 · Jan 30, 2026 · Updated 5 months ago
zhaochen0110 / Awesome_Think_With_Images
Resources and paper list for "Thinking with Images for LVLMs". This repository accompanies our survey on how LVLMs can leverage visual in…
☆1,497 · Mar 9, 2026 · Updated 4 months ago
thuml / Reasoning-Visual-World
Official repository for "Visual Generation Unlocks Human-Like Reasoning through Multimodal World Models", https://arxiv.org/abs/2601.1983…
☆100 · Mar 9, 2026 · Updated 4 months ago
yangzhangok / crystal
official repository of article "CrystaL: Spontaneous Emergence of Visual Latents in MLLMs"
☆18 · May 26, 2026 · Updated 2 months ago
zlab-princeton / vero
Vero: An Open RL Recipe for General Visual Reasoning
☆137 · Jun 19, 2026 · Updated last month
Ryann-Ran / Scone
(CVPR 2026 Highlight) Official repository for Scone (Subject-driven COmposition and DistinctioN Enhancement) model, supporting subject co…
☆32 · Apr 9, 2026 · Updated 3 months ago
Accio-Lab / SwimBird
☆18 · Apr 9, 2026 · Updated 3 months ago
Mini-o3 / Mini-o3
Official Code for "Mini-o3: Scaling Up Reasoning Patterns and Interaction Turns for Visual Search"
☆423 · Jan 29, 2026 · Updated 6 months ago
MikeWangWZHL / PAPO
Official repo for "PAPO: Perception-Aware Policy Optimization for Multimodal Reasoning"
☆153 · Feb 4, 2026 · Updated 5 months ago
AI9Stars / CapImagine
[ICML2026] Imagination Helps Visual Reasoning, But Not Yet in Latent Space
☆28 · May 4, 2026 · Updated 2 months ago
zss02 / BiPS
[CVPR 2026] See Less, See Right: Bi-directional Perceptual Shaping For Multimodal Reasoning
☆22 · Jun 28, 2026 · Updated last month
xinyan-cxy / MINT-CoT
[NeurIPS 2025] MINT-CoT: Enabling Interleaved Visual Tokens in Mathematical Chain-of-Thought Reasoning
☆107 · Sep 19, 2025 · Updated 10 months ago
sen-ye / R3
[ICLR26] Understanding VS. Generation: Navigating Optimization Dilemma in Multimodal Models
☆25 · May 6, 2026 · Updated 2 months ago
inclusionAI / Zooming-without-Zooming
[ICML 2026] ZwZ model family: SOTA fine-grained perception performance; ZoomBench: a new challenging perception benchmark
☆179 · May 4, 2026 · Updated 2 months ago
rootyJeon / Vision-aligned-Latent-Reasoning
[ICML 2026] Official implementation of Vision-aligned Latent Reasoning for Multi-modal Large Language Model (VaLR)
☆20 · Apr 30, 2026 · Updated 2 months ago
Vchitect / Uni-MMMU
[ACL2026 oral] Uni-MMMU : A Massive Multi-discipline Multimodal Unified Benchmark
☆26 · Apr 13, 2026 · Updated 3 months ago
Yui010206 / Adaptive-Visual-Imagination-Control
When and How Much to Imagine: Adaptive Test-Time Scaling with World Models for Visual Spatial Reasoning
☆18 · Jun 2, 2026 · Updated last month
CodeDance-VL / CodeDance
☆32 · Mar 17, 2026 · Updated 4 months ago