Official code for "Monet: Reasoning in Latent Visual Space Beyond Image and Language"
☆138 Feb 25, 2026 Updated last week
Alternatives and similar repositories for Monet
Users who are interested in Monet are comparing it to the repositories listed below
- ☆36 Feb 6, 2026 Updated last month
- Official repository for Scone (Subject-driven Composition and Distinction Enhancement) model, designed to support multi-subject compositi… ☆28 Jan 14, 2026 Updated last month
- A unified reinforcement learning toolbox for joint RL on language models and diffusion models ☆75 Feb 7, 2026 Updated last month
- Code for paper https://arxiv.org/abs/2501.00522 ☆14 Apr 28, 2025 Updated 10 months ago
- [CVPR 2026] Machine Mental Imagery: Empower Multimodal Reasoning with Latent Visual Tokens ☆246 Aug 2, 2025 Updated 7 months ago
- Code for "StreamingTalker: Audio-driven 3D Facial Animation with Autoregressive Diffusion Model", AAAI 2026 Oral ☆45 Jan 16, 2026 Updated last month
- ☆88 Dec 12, 2025 Updated 2 months ago
- ☆26 Jan 12, 2026 Updated last month
- A Benchmark for Evaluating MLLMs' Geometry Performance on Long-Step Problems Requiring Auxiliary Lines ☆32 Updated this week
- [Arxiv 2025] In-Video Instructions: Visual Signals as Generative Control ☆46 Nov 25, 2025 Updated 3 months ago
- The official paper summary of TMLR'25 paper "Survey of Video Diffusion Models: Foundations, Implementations, and Applications" ☆37 Feb 2, 2026 Updated last month
- Official codebase for the paper Latent Visual Reasoning ☆120 Oct 22, 2025 Updated 4 months ago
- RoMeO: Robust Metric Visual Odometry ☆29 Dec 16, 2024 Updated last year
- ☆29 May 7, 2025 Updated 10 months ago
- A Rust library for reading and writing Foxglove MCAP files ☆21 Jan 29, 2023 Updated 3 years ago
- [ICLR 2026] The official repository for the paper "AdaReasoner: Dynamic Tool Orchestration for Iterative Visual Reasoning". ☆72 Feb 27, 2026 Updated last week
- [ICCV 2025] Boosting MLLM Reasoning with Text-Debiased Hint-GRPO ☆46 Jul 1, 2025 Updated 8 months ago
- [arxiv: 2512.19673] Bottom-up Policy Optimization: Your Language Model Policy Secretly Contains Internal Policies ☆61 Feb 6, 2026 Updated last month
- Official Implementation of "LeX-Art: Rethinking Text Generation via Scalable High-Quality Data Synthesis" ☆78 Aug 25, 2025 Updated 6 months ago
- 3D Gaussian Splatting for underwater scene reconstruction via physical-based appearance-medium decoupling ☆23 Feb 13, 2026 Updated 3 weeks ago
- ☆30 Dec 16, 2025 Updated 2 months ago
- ☆18 Jun 10, 2025 Updated 8 months ago
- Test-Time Memory Framework: Control Hallucinations in Foundation Models ☆11 Nov 4, 2025 Updated 4 months ago
- The code used to evaluate embedding models on the Massive Legal Embedding Benchmark (MLEB). ☆31 Feb 24, 2026 Updated last week
- [NeurIPS2025] ReID5o: Achieving Omni Multi-modal Person Re-identification in a Single Model ☆84 Jan 8, 2026 Updated last month
- Dexterous World Models ☆74 Feb 22, 2026 Updated 2 weeks ago
- [NeurIPS'25 Spotlight] Official implementation of "JavisGPT: A Unified Multi-modal LLM for Sounding-Video Comprehension and Generation" ☆69 Feb 26, 2026 Updated last week
- ☆33 Jul 5, 2024 Updated last year
- ☆10 Aug 7, 2024 Updated last year
- ☆14 Aug 10, 2025 Updated 6 months ago
- A free and open-source focus stacking software that supports multi-focus image alignment and fusion. ☆20 Feb 5, 2026 Updated last month
- Repository with code supporting PNAS article ☆11 Jun 6, 2023 Updated 2 years ago
- Arduino library for Gavesha® Robomatics Gear Motor. ☆10 Feb 15, 2025 Updated last year
- Code used in the 2023 ALERT Geomaterials doctoral school on Machine Learning in Geomechanics ☆13 Oct 2, 2023 Updated 2 years ago
- Official codebase for the paper "Reasoning Within the Mind: Dynamic Multimodal Interleaving in Latent Space" ☆65 Dec 17, 2025 Updated 2 months ago
- Animate Any Character in Any World ☆90 Jan 9, 2026 Updated last month
- [ICLR 2025] SuperCorrect: Advancing Small LLM Reasoning with Thought Template Distillation and Self-Correction ☆87 Mar 23, 2025 Updated 11 months ago
- AI foundation and trend seminar tutorial with code ☆24 Oct 25, 2025 Updated 4 months ago
- ☆12 Sep 15, 2023 Updated 2 years ago