tongjingqi/Thinking-with-Video

We introduce 'Thinking with Video', a new paradigm leveraging video generation for multimodal reasoning. Our VideoThinkBench shows that Sora-2 surpasses GPT5 by 10% on eyeballing puzzles and reaches 69% accuracy on MMMU.

☆314

Alternatives and similar repositories for Thinking-with-Video

Users who are interested in Thinking-with-Video compare it to the repositories listed below.

tongjingqi / AI-Can-Learn-Scientific-Taste
We propose Reinforcement Learning from Community Feedback (RLCF), a training paradigm that uses large-scale community signals as supervis…
☆424 · Jul 13, 2026 · Updated last week
tongjingqi / Game-RL
Game-RL: Synthesizing Multimodal Verifiable Game Data to Boost VLMs' General Reasoning
☆156 · Updated this week
tongjingqi / Awesome-Agent-RL
A curated list of awesome resources about reward construction for AI agents. This repository covers cutting-edge research, and practical …
☆60 · Sep 1, 2025 · Updated 10 months ago
Linxi000 / MEDS
☆142 · Jun 24, 2026 · Updated 3 weeks ago
OpenMOSS / MOSS-VL
MOSS-VL is the core multimodal model series within the OpenMOSS ecosystem, dedicated to visual understanding.
☆375 · Updated this week
xinghaow99 / prism
[ICML 2026] Prism: Spectral-Aware Block-Sparse Attention
☆27 · May 22, 2026 · Updated last month
Phospheneser / Phospheneser-awesome-academic-template
An open-source personal academic homepage template characterized by its user-friendly design and extensive scalability.
☆37 · Oct 6, 2025 · Updated 9 months ago
thuml / MiniVeo3-Reasoner
Thinking with Videos from Open-Source Priors. We reproduce chain-of-frames visual reasoning by fine-tuning open-source video models. Give…
☆229 · Apr 13, 2026 · Updated 3 months ago
ThinkMorph / ThinkMorph
[ICLR 2026] The official repository for paper "ThinkMorph: Emergent Properties in Multimodal Interleaved Chain-of-Thought Reasoning"
☆190 · May 1, 2026 · Updated 2 months ago
EnVision-Research / TiViBench
[CVPR 2026] TiViBench: Benchmarking Think-in-Video Reasoning for Video Generative Models
☆67 · Feb 21, 2026 · Updated 4 months ago
thuml / Reasoning-Visual-World
Official repository for "Visual Generation Unlocks Human-Like Reasoning through Multimodal World Models", https://arxiv.org/abs/2601.1983…
☆99 · Mar 9, 2026 · Updated 4 months ago
euReKa025 / AgentLongBench
☆21 · Jan 29, 2026 · Updated 5 months ago
hkust-nlp / mstar
[ICML 2025] M-STAR (Multimodal Self-Evolving TrAining for Reasoning) Project. Diving into Self-Evolving Training for Multimodal Reasoning
☆75 · Jul 13, 2025 · Updated last year
cambrian-mllm / cambrian-s
Cambrian-S: Towards Spatial Supersensing in Video
☆560 · Apr 3, 2026 · Updated 3 months ago
Video-Reason / Awesome-Video-Reasoning
This is a collection of recent papers on reasoning in video generation models.
☆164 · Updated this week
ZiyuGuo99 / MME-CoF
Are Video Models Ready as Zero-shot Reasoners?
☆87 · Nov 24, 2025 · Updated 7 months ago
hkust-nlp / Laser
[ICLR 2026] Laser: Learn to Reason Efficiently with Adaptive Length-based Reward Shaping
☆66 · May 22, 2025 · Updated last year
lcqysl / DiffThinker
[ICML 2026] Official repo for "DiffThinker: Towards Generative Multimodal Reasoning with Diffusion Models"
☆185 · Jan 4, 2026 · Updated 6 months ago
OpenMOSS / FutureOmni
☆26 · Jan 22, 2026 · Updated 5 months ago
lcqysl / VideoSSR
[CVPR 2026] Official repo for "VideoSSR: Video Self-Supervised Reinforcement Learning"
☆41 · Nov 11, 2025 · Updated 8 months ago
ssmisya / PolicyShiftGuard
PolicyShiftGuard: Benchmarking and Improving Policy-Adaptive Image Guardrails
☆21 · Jul 8, 2026 · Updated last week
OpenMOSS / MOVA
MOVA: Towards Scalable and Synchronized Video–Audio Generation
☆1,070 · Jun 18, 2026 · Updated last month
Mikivishy / FullFront
The official code repository for the FullFront benchmark
☆27 · May 16, 2025 · Updated last year
realZillionX / InspireSkill
Agent cockpit for the Inspire ML platform (启智, qz.sii.edu.cn): Skill + CLI, one command, every operation, straight from…
☆180 · Updated this week
JingYiJun / qz_ssh_starter
☆21 · Mar 2, 2026 · Updated 4 months ago
multimodal-reasoning-lab / Bagel-Zebra-CoT
https://huggingface.co/datasets/multimodal-reasoning-lab/Zebra-CoT
☆137 · Jan 30, 2026 · Updated 5 months ago
baaivision / Emu3.5
Native Multimodal Models are World Learners
☆1,535 · Dec 30, 2025 · Updated 6 months ago
thu-ml / Causal-Forcing
[ICML 2026] Official codebase for "Causal Forcing: Autoregressive Diffusion Distillation Done Right for High-Quality Real-Time Interactiv…
☆864 · Jul 9, 2026 · Updated last week
Linzwcs / AutoMusicTheoryQA
☆22 · Nov 21, 2025 · Updated 7 months ago
STARE-bench / STARE
☆19 · Oct 12, 2025 · Updated 9 months ago
ssmisya / PRMBench
[ACL' 25] The official code repository for PRMBench: A Fine-grained and Challenging Benchmark for Process-Level Reward Models.
☆93 · Feb 15, 2025 · Updated last year
zzzhr97 / SpecBench
Code repository for the ICML 2026 paper "Reasoning over Boundaries: Enhancing Specification Alignment via Test-time Deliberation".
☆24 · Jun 14, 2026 · Updated last month
huaixuheqing / VPPO-RL
[ICLR 2026] Official repo for "Spotlight on Token Perception for Multimodal Reinforcement Learning"
☆69 · Apr 3, 2026 · Updated 3 months ago
OpenSparseLLMs / Skip-DiT
✈️ [ICCV 2025] Towards Stabilized and Efficient Diffusion Transformers through Long-Skip-Connections with Spectral Constraints
☆80 · Jul 10, 2025 · Updated last year
haowei-freesky / HERMES
Official Repository for paper "HERMES: KV Cache as Hierarchical Memory for Efficient Streaming Video Understanding" [ACL 2026]
☆92 · May 8, 2026 · Updated 2 months ago
sen-ye / R3
[ICLR 2026] Understanding vs. Generation: Navigating Optimization Dilemma in Multimodal Models
☆25 · May 6, 2026 · Updated 2 months ago
Vchitect / Uni-MMMU
[ACL 2026 oral] Uni-MMMU: A Massive Multi-discipline Multimodal Unified Benchmark
☆25 · Apr 13, 2026 · Updated 3 months ago
lcqysl / FrameThinker
[ICLR 2026] Official repo for "FrameThinker: Learning to Think with Long Videos via Multi-Turn Frame Spotlighting"
☆50 · Oct 9, 2025 · Updated 9 months ago
OpenMOSS / RoboOmni
Official code of "RoboOmni: Proactive Robot Manipulation in Omni-modal Context"
☆116 · Mar 28, 2026 · Updated 3 months ago