Are Video Models Ready as Zero-shot Reasoners?
☆87Nov 24, 2025Updated 5 months ago
Alternatives and similar repositories for MME-CoF
Users that are interested in MME-CoF are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- 📝The official repository of "Rethinking Cross-Generator Image Forgery Detection through DINOv3"☆23Dec 2, 2025Updated 5 months ago
- 🔥 A continuously updated collection of papers, datasets, and benchmarks on post-training and alignment for video generation.☆130Apr 13, 2026Updated 3 weeks ago
- The first Interleaved framework for textual reasoning within the visual generation process☆159Mar 16, 2026Updated last month
- [NeurIPS 2025] Encoder-Decoder Diffusion Language Models for Efficient Training and Inference☆41Oct 29, 2025Updated 6 months ago
- [ICCV2025] The official code of "DreamRelation: Relation-Centric Video Customization"☆28Feb 4, 2026Updated 3 months ago
- 1-Click AI Models by DigitalOcean Gradient • AdDeploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click. Zero configuration with optimized deployments.
- ☆19Sep 24, 2024Updated last year
- [NeurIPS 2025] MINT-CoT: Enabling Interleaved Visual Tokens in Mathematical Chain-of-Thought Reasoning☆106Sep 19, 2025Updated 7 months ago
- CVPR2025 | TASTE-Rob: Advancing Video Generation of Task-Oriented Hand-Object Interaction for Generalizable Robotic Manipulation☆42Jan 29, 2026Updated 3 months ago
- CrossLMM: Decoupling Long Video Sequences from LMMs via Dual Cross-Attention Mechanisms☆25Dec 21, 2025Updated 4 months ago
- Official pytorch implementation of "DSPoint: Dual-scale Point Cloud Recognition with High-frequency Fusion"☆19Feb 4, 2025Updated last year
- [CVPR 2025] The First Investigation of CoT Reasoning (RL, TTS, Reflection) in Image Generation☆863Mar 19, 2026Updated last month
- Implementation for "DeltaPhi: Learning Physical Trajectory Residual for PDE Solving"☆13Jun 17, 2024Updated last year
- This is the official repository for the paper "FLUX-Reason-6M & PRISM-Bench: A Million-Scale Text-to-Image Reasoning Dataset and Comprehe…☆128Jan 29, 2026Updated 3 months ago
- ☆37Dec 18, 2025Updated 4 months ago
- Open source password manager - Proton Pass • AdSecurely store, share, and autofill your credentials with Proton Pass, the end-to-end encrypted password manager trusted by millions.
- Just wanna see what type and how many GPUs/TPUs are used in CVPR 2025 oral papers. Fun vibe coding with LLMs.☆12Apr 24, 2025Updated last year
- This is a framework for evaluating reasoning in foundational Video Models.☆89Apr 30, 2026Updated last week
- 「AAAI 2024」 Referred by Multi-Modality: A Unified Temporal Transformers for Video Object Segmentation☆84Jun 13, 2025Updated 10 months ago
- ☆57Oct 17, 2021Updated 4 years ago
- [ACL2026] Uni-MMMU : A Massive Multi-discipline Multimodal Unified Benchmark☆24Apr 13, 2026Updated 3 weeks ago
- Official repository for "Visual Generation Unlocks Human-Like Reasoning through Multimodal World Models", https://arxiv.org/abs/2601.1983…☆88Mar 9, 2026Updated 2 months ago
- [ECCV 2024] Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems?☆177Apr 28, 2025Updated last year
- InsertAnywhere: Bridging 4D Scene Geometry and Diffusion Models for Realistic Video Object Insertion☆88Dec 27, 2025Updated 4 months ago
- Official implementation of "Repurposing Video Diffusion Transformers for Robust Point Tracking"☆43Dec 24, 2025Updated 4 months ago
- 1-Click AI Models by DigitalOcean Gradient • AdDeploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click. Zero configuration with optimized deployments.
- WMS庫存管理系統是一個基於PHP的Web應用程序,旨在幫助中小型企業管理其庫存、供應商和產品信息。該系統提供了直觀的用戶界面和強大的功能,使庫存管理變得簡單高效。☆21Feb 18, 2025Updated last year
- [CVPR 2025 highlight] Generating 6DoF Object Manipulation Trajectories from Action Description in Egocentric Vision☆40Dec 2, 2025Updated 5 months ago
- MME-CoT: Benchmarking Chain-of-Thought in LMMs for Reasoning Quality, Robustness, and Efficiency☆137Aug 5, 2025Updated 9 months ago
- ☆68Feb 1, 2026Updated 3 months ago
- This is a collection of recent papers on reasoning in video generation models.☆153May 2, 2026Updated last week
- Official implementation of "MV-TAP: Tracking Any Point in Multi-View Videos"☆43Mar 10, 2026Updated last month
- [ICML 2025] EasyRef: Omni-Generalized Group Image Reference for Diffusion Models via Multimodal LLM☆72Jul 16, 2025Updated 9 months ago
- Mirage: One-Step Video Diffusion for Photorealistic and Coherent Asset Editing in Driving Scenes☆28Mar 12, 2026Updated last month
- ☆48Apr 14, 2025Updated last year
- GPU virtual machines on DigitalOcean Gradient AI • AdGet to production fast with high-performance AMD and NVIDIA GPUs you can spin up in seconds. The definition of operational simplicity.
- ☆86Oct 10, 2025Updated 6 months ago
- The first open-domain closed-loop revisited benchmark for evaluating memory consistency and action control in world models.☆52Feb 10, 2026Updated 2 months ago
- open-sourced video dataset with dynamic scenes and camera movements annotation☆92Apr 24, 2025Updated last year
- A low-cost avatar system☆68Nov 13, 2025Updated 5 months ago
- [NIPS 2025] FreqPolicy: Frequency Autoregressive Visuomotor Policy with Continuous Tokens☆21Oct 12, 2025Updated 6 months ago
- 📷 [CVPR'26] Camera-controlled text-to-video generation, now with intrinsics, distortion and orientation control!☆159Apr 27, 2026Updated last week
- [NeurIPS 2025] T2I-R1: Reinforcing Image Generation with Collaborative Semantic-level and Token-level CoT☆435Sep 18, 2025Updated 7 months ago