Are Video Models Ready as Zero-shot Reasoners?
☆86Nov 24, 2025Updated 4 months ago
Alternatives and similar repositories for MME-CoF
Users who are interested in MME-CoF are comparing it to the libraries listed below.
- 📝The official repository of "Rethinking Cross-Generator Image Forgery Detection through DINOv3"☆22Dec 2, 2025Updated 4 months ago
- 🔥 A continuously updated collection of papers, datasets, and benchmarks on post-training and alignment for video generation.☆110Apr 11, 2026Updated last week
- The first Interleaved framework for textual reasoning within the visual generation process☆159Mar 16, 2026Updated last month
- [NeurIPS 2025] Encoder-Decoder Diffusion Language Models for Efficient Training and Inference☆37Oct 29, 2025Updated 5 months ago
- ☆19Sep 24, 2024Updated last year
- [NeurIPS 2025] MINT-CoT: Enabling Interleaved Visual Tokens in Mathematical Chain-of-Thought Reasoning☆103Sep 19, 2025Updated 7 months ago
- CVPR2025 | TASTE-Rob: Advancing Video Generation of Task-Oriented Hand-Object Interaction for Generalizable Robotic Manipulation☆42Jan 29, 2026Updated 2 months ago
- Official pytorch implementation of "DSPoint: Dual-scale Point Cloud Recognition with High-frequency Fusion"☆19Feb 4, 2025Updated last year
- [CVPR 2025] The First Investigation of CoT Reasoning (RL, TTS, Reflection) in Image Generation☆860Mar 19, 2026Updated 3 weeks ago
- ☆36Dec 18, 2025Updated 4 months ago
- This is a framework for evaluating reasoning in foundational Video Models.☆88Apr 1, 2026Updated 2 weeks ago
- Just wanna see what type and how many GPUs/TPUs are used in CVPR 2025 oral papers. Fun vibe coding with LLMs.☆12Apr 24, 2025Updated 11 months ago
- 「AAAI 2024」 Referred by Multi-Modality: A Unified Temporal Transformers for Video Object Segmentation☆83Jun 13, 2025Updated 10 months ago
- ☆57Oct 17, 2021Updated 4 years ago
- This repository shares undergraduate course materials for the Electronic Information Engineering program at the University of Science and…☆66Mar 10, 2026Updated last month
- [ACL2026] Uni-MMMU : A Massive Multi-discipline Multimodal Unified Benchmark☆23Feb 13, 2026Updated 2 months ago
- Official repository for "Visual Generation Unlocks Human-Like Reasoning through Multimodal World Models", https://arxiv.org/abs/2601.1983…☆87Mar 9, 2026Updated last month
- [ECCV 2024] Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems?☆176Apr 28, 2025Updated 11 months ago
- InsertAnywhere: Bridging 4D Scene Geometry and Diffusion Models for Realistic Video Object Insertion☆86Dec 27, 2025Updated 3 months ago
- Official implementation of "Repurposing Video Diffusion Transformers for Robust Point Tracking"☆42Dec 24, 2025Updated 3 months ago
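- The WMS inventory management system is a PHP-based web application designed to help small and medium-sized enterprises manage their inventory, suppliers, and product information. The system provides an intuitive user interface and powerful features that make inventory management simple and efficient.☆21Feb 18, 2025Updated last year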
- [CVPR 2025 highlight] Generating 6DoF Object Manipulation Trajectories from Action Description in Egocentric Vision☆40Dec 2, 2025Updated 4 months ago
- MME-CoT: Benchmarking Chain-of-Thought in LMMs for Reasoning Quality, Robustness, and Efficiency☆137Aug 5, 2025Updated 8 months ago
- ☆15Mar 18, 2025Updated last year
- This is a collection of recent papers on reasoning in video generation models.☆150Mar 30, 2026Updated 2 weeks ago
- [ICML 2025] EasyRef: Omni-Generalized Group Image Reference for Diffusion Models via Multimodal LLM☆72Jul 16, 2025Updated 9 months ago
- Mirage: One-Step Video Diffusion for Photorealistic and Coherent Asset Editing in Driving Scenes☆27Mar 12, 2026Updated last month
- ☆48Apr 14, 2025Updated last year
- The first open-domain closed-loop revisited benchmark for evaluating memory consistency and action control in world models.☆49Feb 10, 2026Updated 2 months ago
- ☆86Oct 10, 2025Updated 6 months ago
- open-sourced video dataset with dynamic scenes and camera movements annotation☆91Apr 24, 2025Updated 11 months ago
- ☆60Updated this week
- 📷 [CVPR'26] Camera-controlled text-to-video generation, now with intrinsics, distortion and orientation control!☆151Apr 12, 2026Updated last week
- A low-cost avatar system☆67Nov 13, 2025Updated 5 months ago
- [NeurIPS 2025 D&B (Spotlight🌟)] TIME: A Multi-level Benchmark for Temporal Reasoning of LLMs in Real-World Scenario☆30Oct 5, 2025Updated 6 months ago
- [NIPS 2025] FreqPolicy: Frequency Autoregressive Visuomotor Policy with Continuous Tokens☆22Oct 12, 2025Updated 6 months ago
- ☆11Nov 21, 2022Updated 3 years ago
- [NeurIPS 2025] T2I-R1: Reinforcing Image Generation with Collaborative Semantic-level and Token-level CoT☆431Sep 18, 2025Updated 7 months ago
- [ICLR 2026] MMDuet2: Enhancing Proactive Interaction of Video MLLMs with Multi-Turn Reinforcement Learning☆25Jan 14, 2026Updated 3 months ago