linjh1118 / Awesome-MLLM-For-Games

MLLM @ Game

☆14

Alternatives and similar repositories for Awesome-MLLM-For-Games

Users who are interested in Awesome-MLLM-For-Games compare it to the repositories listed below.

ggg0919 / cantor
☆90 Updated last year
RUCAIBox / Virgo
Official code of *Virgo: A Preliminary Exploration on Reproducing o1-like MLLM*
☆109 Updated 5 months ago
GAIR-NLP / MAYE
Rethinking RL Scaling for Vision Language Models: A Transparent, From-Scratch Framework and Comprehensive Evaluation Scheme
☆145 Updated 7 months ago
SparksJoe / Prism
A Framework for Decoupling and Assessing the Capabilities of VLMs
☆43 Updated last year
cnzzx / VSA
Vision Search Assistant: Empower Vision-Language Models as Multimodal Search Engines
☆126 Updated last year
TheEighthDay / SeekWorld
The first attempt to replicate o3-like visual clue-tracking reasoning capabilities.
☆59 Updated 4 months ago
waltonfuture / RL-with-Cold-Start
SFT+RL boosts multimodal reasoning
☆37 Updated 4 months ago
hkgc-1 / GHPO
☆52 Updated 4 months ago
xmu-xiaoma666 / Multimodal-Open-O1
Multimodal Open-O1 (MO1) is designed to enhance the accuracy of inference models by utilizing a novel prompt-based approach. This tool wo…
☆29 Updated last year
ding523 / Curr_REFT
☆72 Updated 5 months ago
Kwai-YuanQi / MM-RLHF
The Next Step Forward in Multimodal LLM Alignment
☆186 Updated 6 months ago
FudanDISC / ReForm-Eval
A benchmark for evaluating the capabilities of large vision-language models (LVLMs)
☆45 Updated 2 years ago
LengSicong / MMR1
MMR1: Enhancing Multimodal Reasoning with Variance-Aware Sampling and Open Resources
☆208 Updated last month
chenllliang / G1
G1: Bootstrapping Perception and Reasoning Abilities of Vision-Language Model via Reinforcement Learning
☆88 Updated 6 months ago
Kun-Xiang / AtomThink
Official Repository of "AtomThink: Multimodal Slow Thinking with Atomic Step Reasoning"
☆57 Updated 3 months ago
njucckevin / MM-Self-Improve
A Self-Training Framework for Vision-Language Reasoning
☆86 Updated 9 months ago
ZhangXJ199 / TinyLLaVA-Video
A Simple Framework of Small-scale LMMs for Video Understanding
☆96 Updated 5 months ago
MiroMindAI / MiroTrain
MiroTrain is an efficient and algorithm-first framework for post-training large agentic models.
☆93 Updated 2 months ago
Tencent / digitalhuman
☆171 Updated last week
EvolvingLMMs-Lab / multimodal-search-r1
MMSearch-R1 is an end-to-end RL framework that enables LMMs to perform on-demand, multi-turn search with real-world multimodal search too…
☆348 Updated 2 months ago
testtimescaling / testtimescaling.github.io
Repository for "What, How, Where, and How Well? A Survey on Test-Time Scaling in Large Language Models"
☆77 Updated this week
Episoode / Double-Bench
Official Code Repository for Double-Bench
☆24 Updated last month
MiroMindAI / MiroMind-M1
MiroMind-M1 is a fully open-source series of reasoning language models built on Qwen-2.5, focused on advancing mathematical reasoning.
☆241 Updated 3 months ago
YanqiDai / MMRole
(ICLR'25) A Comprehensive Framework for Developing and Evaluating Multimodal Role-Playing Agents
☆89 Updated 9 months ago
RifleZhang / LLaVA-Reasoner-DPO
☆99 Updated 10 months ago
ai-in-pm / rStar-Math
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking
☆39 Updated 10 months ago
MetabrainAGI / Awaker2.5-VL
☆35 Updated 9 months ago
We-Math / We-Math2.0
The code and data of We-Math 2.0.
☆161 Updated 2 months ago
microsoft / DELT
DELT: Data Efficacy for Language Model Training
☆42 Updated 2 months ago
DAMO-NLP-SG / multimodal_textbook
[ICCV 2025 Highlight] The official repository for "2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining"
☆175 Updated 8 months ago