Imagine While Reasoning in Space: Multimodal Visualization-of-Thought (ICML 2025)
☆70 · Apr 12, 2025 · Updated 10 months ago
Alternatives and similar repositories for MVoT
Users who are interested in MVoT are comparing it to the repositories listed below.
- [NeurIPS 2025] Scaling Language-centric Omnimodal Representation Learning ☆33 · Feb 6, 2026 · Updated last month
- [Blog 1] Recording a bug of grpo_trainer in some R1 projects ☆22 · Feb 23, 2025 · Updated last year
- A Self-Training Framework for Vision-Language Reasoning ☆88 · Jan 23, 2025 · Updated last year
- An Open-Source RAG Workload Trace to Optimize RAG Serving Systems ☆35 · Nov 18, 2025 · Updated 3 months ago
- The official code of "VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning" [NeurIPS25] ☆183 · Jun 5, 2025 · Updated 9 months ago
- [MM2024, oral] "Self-Supervised Visual Preference Alignment" https://arxiv.org/abs/2404.10501 ☆61 · Jul 26, 2024 · Updated last year
- ☆59 · Jun 20, 2024 · Updated last year
- G1: Bootstrapping Perception and Reasoning Abilities of Vision-Language Model via Reinforcement Learning ☆101 · May 20, 2025 · Updated 9 months ago
- Open-Pandora: On-the-fly Control Video Generation ☆35 · Nov 28, 2024 · Updated last year
- [ICLR 2026] "Co-rewarding: Stable Self-supervised RL for Eliciting Reasoning in Large Language Models" ☆51 · Feb 4, 2026 · Updated last month
- Official code for "pi-Tuning: Transferring Multimodal Foundation Models with Optimal Multi-task Interpolation", ICML 2023. ☆33 · Jul 21, 2023 · Updated 2 years ago
- ☆33 · Oct 31, 2024 · Updated last year
- o1 Chain of Thought Examples ☆33 · Oct 4, 2024 · Updated last year
- MultiMath: Bridging Visual and Mathematical Reasoning for Large Language Models ☆32 · Jan 22, 2025 · Updated last year
- MM-Instruct: Generated Visual Instructions for Large Multimodal Model Alignment ☆35 · Jul 1, 2024 · Updated last year
- ☆25 · Sep 1, 2025 · Updated 6 months ago
- ☆88 · Jun 7, 2024 · Updated last year
- Extend OpenRLHF to support LMM RL training for reproduction of DeepSeek-R1 on multimodal tasks. ☆842 · May 14, 2025 · Updated 9 months ago
- Official Implementation of MDK12-Bench: A Multi-Discipline Benchmark for Evaluating Reasoning in Multimodal Large Language Models ☆12 · Nov 1, 2025 · Updated 4 months ago
- [ACL2025 Oral🔥] Turning Trash into Treasure: Accelerating Inference of Large Language Models with Token Recycling ☆22 · Nov 11, 2025 · Updated 3 months ago
- [ICLR 2026] Thinking on the Fly: Test-Time Reasoning Enhancement via Latent Thought Policy Optimization ☆18 · Feb 14, 2026 · Updated 3 weeks ago
- [ACM MM 2023] The released code of paper "Deconfounded Visual Question Generation with Causal Inference" ☆11 · Sep 3, 2024 · Updated last year
- CoCoFL: Communication- and Computation-Aware Federated Learning via Partial NN Freezing and Quantization ☆13 · Aug 3, 2024 · Updated last year
- Official implementation of the paper "Pretraining Language Models to Ponder in Continuous Space" ☆25 · Jul 21, 2025 · Updated 7 months ago
- ☆13 · Nov 5, 2024 · Updated last year
- TinyLLaVA-Video-R1: Towards Smaller LMMs for Video Reasoning ☆114 · Dec 24, 2025 · Updated 2 months ago
- ☆14 · Sep 17, 2024 · Updated last year
- ☆17 · Dec 23, 2025 · Updated 2 months ago
- Official repository for "Embodied Agents Meet Personalization: Investigating Challenges and Solutions Through the Lens of Memory Utilizat… ☆19 · Oct 24, 2025 · Updated 4 months ago
- End-to-end implementation of the Social Graph Network (SGN), described in the Structural Reasoning for Image-based Social Relation Recogn… ☆13 · Apr 3, 2024 · Updated last year
- ☆15 · Feb 11, 2025 · Updated last year
- The official implementation of Hard Negative Sampling via Large Language Models for Recommendation. ☆11 · Jan 17, 2026 · Updated last month
- MLR-Bench: Evaluating AI Agents on Open-Ended Machine Learning Research ☆22 · Sep 23, 2025 · Updated 5 months ago
- [ICML 2025 Oral] This is the official repository of the paper "What Limits Virtual Agent Application? OmniBench: A Scalable Multi-Dimensi… ☆21 · Jun 12, 2025 · Updated 8 months ago
- Personalized Fragrance Recommendation for Aromatherapy: A Machine Learning Approach Based on Personality Traits and Electrodermal Activit… ☆14 · May 1, 2025 · Updated 10 months ago
- Generating Summaries with Controllable Readability Levels (EMNLP 2023) ☆15 · Aug 6, 2025 · Updated 7 months ago
- Memory experiments with LLMs ☆10 · Mar 31, 2023 · Updated 2 years ago
- ☆10 · Jan 28, 2024 · Updated 2 years ago
- LONGAGENT: Scaling Language Models to 128k Context through Multi-Agent Collaboration ☆11 · Mar 11, 2024 · Updated last year