Thinking with Videos from Open-Source Priors. We reproduce chain-of-frames visual reasoning by fine-tuning open-source video models. Give it a star 🌟 if you find it useful.
☆219 · Oct 12, 2025 · Updated 5 months ago
Alternatives and similar repositories for MiniVeo3-Reasoner
Users who are interested in MiniVeo3-Reasoner are comparing it to the libraries listed below.
- This is a framework for evaluating reasoning in foundational Video Models. ☆83 · Mar 7, 2026 · Updated 2 weeks ago
- We introduce Reasoning via Video, a new paradigm that uses maze-solving video generation to probe multimodal reasoning; our VR-Bench show… ☆56 · Feb 4, 2026 · Updated last month
- [CVPR 2026] An official implementation of "Think Visually, Reason Textually: Vision-Language Synergy in ARC" ☆39 · Nov 26, 2025 · Updated 4 months ago
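- Labs, exercises, lecture slides, and final-exam review materials for the Spring 2023 Compiler Systems course at Harbin Institute of Technology ☆11 · Jul 30, 2023 · Updated 2 years ago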
- Official repository for "RLVR-World: Training World Models with Reinforcement Learning" (NeurIPS 2025), https://arxiv.org/abs/2505.13934 ☆235 · Oct 28, 2025 · Updated 4 months ago
- UniGeo: Taming Video Diffusion for Unified Consistent Geometry Estimation ☆136 · Jun 10, 2025 · Updated 9 months ago
- ☆214 · Dec 19, 2025 · Updated 3 months ago
- [ICLR 2026] The official repository for paper "ThinkMorph: Emergent Properties in Multimodal Interleaved Chain-of-Thought Reasoning" ☆167 · Jan 26, 2026 · Updated 2 months ago
- ☆19 · Jan 26, 2026 · Updated 2 months ago
- Official repository for "Visual Generation Unlocks Human-Like Reasoning through Multimodal World Models", https://arxiv.org/abs/2601.1983… ☆85 · Mar 9, 2026 · Updated 2 weeks ago
- We introduce 'Thinking with Video', a new paradigm leveraging video generation for multimodal reasoning. Our VideoThinkBench shows that S… ☆285 · Updated this week
- Retargeting of whole-body human motion to humanoid robots for dexterous manipulation of articulated objects. ☆26 · Jan 28, 2026 · Updated last month
- A list of works on video generation towards world model ☆433 · Mar 18, 2026 · Updated last week
- Code release for paper "Test-Time Training Done Right" ☆420 · Jan 5, 2026 · Updated 2 months ago
- [Survey] Next Token Prediction Towards Multimodal Intelligence: A Comprehensive Survey ☆476 · Jan 17, 2025 · Updated last year
- Minimal (truly) muP implementation, consistent with TP4 and TP5 papers notation ☆14 · Jan 2, 2026 · Updated 2 months ago
- 📝The official repository of "Rethinking Cross-Generator Image Forgery Detection through DINOv3" ☆21 · Dec 2, 2025 · Updated 3 months ago
- Long-RL: Scaling RL to Long Sequences (NeurIPS 2025) ☆702 · Sep 24, 2025 · Updated 6 months ago
- ICML2025 ☆65 · Aug 28, 2025 · Updated 6 months ago
- EchoInk-R1: Exploring Audio-Visual Reasoning in Multimodal LLMs via Reinforcement Learning [🔥The Exploration of R1 for General Audio-Vi… ☆76 · May 18, 2025 · Updated 10 months ago
- [CVPR 2025] TreeMeshGPT: Artistic Mesh Generation with Autoregressive Tree Sequencing ☆185 · May 22, 2025 · Updated 10 months ago
- Official Implementation of Paper: WMPO: World Model-based Policy Optimization for Vision-Language-Action Models ☆187 · Jan 4, 2026 · Updated 2 months ago
- [ICCV 2025] Video-T1: Test-Time Scaling for Video Generation ☆307 · Mar 7, 2026 · Updated 2 weeks ago
- [CVPR'25 - Rating 555] Official PyTorch implementation of Lumos: Learning Visual Generative Priors without Text ☆53 · Mar 16, 2025 · Updated last year
- Code release for "PISA Experiments: Exploring Physics Post-Training for Video Diffusion Models by Watching Stuff Drop" (ICML 2025) ☆54 · May 8, 2025 · Updated 10 months ago
- ☆112 · Jan 8, 2025 · Updated last year
- Codec-Aligned Sparsity as a Foundational Principle for Multimodal Intelligence ☆305 · Mar 2, 2026 · Updated 3 weeks ago
- A collection of awesome think with videos papers. ☆95 · Dec 1, 2025 · Updated 3 months ago
- AC-DiT: Adaptive Coordination Diffusion Transformer for Mobile Manipulation ☆37 · Feb 23, 2026 · Updated last month
- official repo for `thinking with images through-self-calling` ☆25 · Dec 28, 2025 · Updated 2 months ago
- ☆66 · Feb 4, 2026 · Updated last month
- Official eval code for ROVER: Benchmarking Reciprocal Cross-Modal Reasoning for Omnimodal Generation ☆27 · Dec 12, 2025 · Updated 3 months ago
- ☆37 · Feb 4, 2026 · Updated last month
- logit lens for VGGT ☆27 · Dec 2, 2025 · Updated 3 months ago
- The official code of Yume ☆635 · Jan 14, 2026 · Updated 2 months ago
- [CVPR 2026] FluxMem: Adaptive Hierarchical Memory for Streaming Video Understanding ☆45 · Mar 16, 2026 · Updated last week
- This repository contains the code for the paper - "Aligning Text, Images, and 3D Structure Token-by-Token" (CVPR 2026) ☆44 · Jun 11, 2025 · Updated 9 months ago
- ☆18 · Aug 21, 2025 · Updated 7 months ago
- Evaluate the Quality of Critique ☆36 · Jun 1, 2024 · Updated last year