chengzu-li/MVoT

Imagine While Reasoning in Space: Multimodal Visualization-of-Thought (ICML 2025)

☆70

Alternatives and similar repositories for MVoT

Users who are interested in MVoT are comparing it to the repositories listed below.

LCO-Embedding / LCO-Embedding
[NeurIPS 2025] Scaling Language-centric Omnimodal Representation Learning
☆33 · Feb 6, 2026 · Updated last month
Hui-design / R1-Video-fixbug
[Blog 1] Recording a bug of grpo_trainer in some R1 projects
☆22 · Feb 23, 2025 · Updated last year
njucckevin / MM-Self-Improve
A Self-Training Framework for Vision-Language Reasoning
☆88 · Jan 23, 2025 · Updated last year
flashserve / RAGPulse
An Open-Source RAG Workload Trace to Optimize RAG Serving Systems
☆35 · Nov 18, 2025 · Updated 3 months ago
TIGER-AI-Lab / VL-Rethinker
The official code of "VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning" [NeurIPS25]
☆183 · Jun 5, 2025 · Updated 9 months ago
Kevinz-code / SeVa
[MM2024, oral] "Self-Supervised Visual Preference Alignment" https://arxiv.org/abs/2404.10501
☆61 · Jul 26, 2024 · Updated last year
ZhangYiqun018 / StickerConv
☆59 · Jun 20, 2024 · Updated last year
chenllliang / G1
G1: Bootstrapping Perception and Reasoning Abilities of Vision-Language Model via Reinforcement Learning
☆101 · May 20, 2025 · Updated 9 months ago
OpenSparseLLMs / Open-Pandora
Open-Pandora: On-the-fly Control Video Generation
☆35 · Nov 28, 2024 · Updated last year
tmlr-group / Co-rewarding
[ICLR 2026] "Co-rewarding: Stable Self-supervised RL for Eliciting Reasoning in Large Language Models"
☆51 · Feb 4, 2026 · Updated last month
TencentARC / pi-Tuning
Official code for "pi-Tuning: Transferring Multimodal Foundation Models with Optimal Multi-task Interpolation", ICML 2023.
☆33 · Jul 21, 2023 · Updated 2 years ago
cmu-mind / RISE
☆33 · Oct 31, 2024 · Updated last year
bradhilton / o1-chain-of-thought
o1 Chain of Thought Examples
☆33 · Oct 4, 2024 · Updated last year
pengshuai-rin / MultiMath
MultiMath: Bridging Visual and Mathematical Reasoning for Large Language Models
☆32 · Jan 22, 2025 · Updated last year
jihaonew / MM-Instruct
MM-Instruct: Generated Visual Instructions for Large Multimodal Model Alignment
☆35 · Jul 1, 2024 · Updated last year
yiyangzhang-hz / PPT
☆25 · Sep 1, 2025 · Updated 6 months ago
LightChen233 / M3CoT
☆88 · Jun 7, 2024 · Updated last year
TideDra / lmm-r1
Extend OpenRLHF to support LMM RL training for reproduction of DeepSeek-R1 on multimodal tasks.
☆842 · May 14, 2025 · Updated 9 months ago
LanceZPF / MDK12
Official Implementation of MDK12-Bench: A Multi-Discipline Benchmark for Evaluating Reasoning in Multimodal Large Language Models
☆12 · Nov 1, 2025 · Updated 4 months ago
Luowaterbi / TokenRecycling
[ACL2025 Oral🔥] Turning Trash into Treasure: Accelerating Inference of Large Language Models with Token Recycling
☆22 · Nov 11, 2025 · Updated 3 months ago
ltpo2025 / LTPO
[ICLR 2026] Thinking on the Fly: Test-Time Reasoning Enhancement via Latent Thought Policy Optimization
☆18 · Feb 14, 2026 · Updated 3 weeks ago
Gary-code / KECVQG
[ACM MM 2023] The released code of paper "Deconfounded Visual Question Generation with Causal Inference"
☆11 · Sep 3, 2024 · Updated last year
k1l1 / CoCoFL
CoCoFL: Communication- and Computation-Aware Federated Learning via Partial NN Freezing and Quantization
☆13 · Aug 3, 2024 · Updated last year
LUMIA-Group / PonderingLM
Official implementation of the paper "Pretraining Language Models to Ponder in Continuous Space"
☆25 · Jul 21, 2025 · Updated 7 months ago
MM-FIRE / FIRE
☆13 · Nov 5, 2024 · Updated last year
ZhangXJ199 / TinyLLaVA-Video-R1
TinyLLaVA-Video-R1: Towards Smaller LMMs for Video Reasoning
☆114 · Dec 24, 2025 · Updated 2 months ago
it-hao / MITNet
☆14 · Sep 17, 2024 · Updated last year
mansicer / self-verification
☆17 · Dec 23, 2025 · Updated 2 months ago
Connoriginal / MEMENTO
Official repository for "Embodied Agents Meet Personalization: Investigating Challenges and Solutions Through the Lens of Memory Utilizat…
☆19 · Oct 24, 2025 · Updated 4 months ago
verlab / Structural_Reasoning_SRR
End-to-end implementation of the Social Graph Network (SGN), described in the Structural Reasoning for Image-based Social Relation Recogn…
☆13 · Apr 3, 2024 · Updated last year
LyWang12 / CUTI-Domain
☆15 · Feb 11, 2025 · Updated last year
user683 / HNLMRec
The official implementation of Hard Negative Sampling via Large Language Models for Recommendation.
☆11 · Jan 17, 2026 · Updated last month
chchenhui / mlrbench
MLR-Bench: Evaluating AI Agents on Open-Ended Machine Learning Research
☆22 · Sep 23, 2025 · Updated 5 months ago
antgroup / OmniBench
[ICML 2025 Oral] This is the official repository of the paper "What Limits Virtual Agent Application? OmniBench: A Scalable Multi-Dimensi…
☆21 · Jun 12, 2025 · Updated 8 months ago
euyis1019 / fragrance-recommendation-dataset
Personalized Fragrance Recommendation for Aromatherapy: A Machine Learning Approach Based on Personality Traits and Electrodermal Activit…
☆14 · May 1, 2025 · Updated 10 months ago
amazon-science / controllable-readability-summarization
Generating Summaries with Controllable Readability Levels (EMNLP 2023)
☆15 · Aug 6, 2025 · Updated 7 months ago
eminorhan / llm-memory
Memory experiments with LLMs
☆10 · Mar 31, 2023 · Updated 2 years ago
MJ-Jang / BECEL
☆10 · Jan 28, 2024 · Updated 2 years ago
vyomakesh09 / longagent
LONGAGENT: Scaling Language Models to 128k Context through Multi-Agent Collaboration
☆11 · Mar 11, 2024 · Updated last year