Are Video Models Ready as Zero-shot Reasoners?
☆87Nov 24, 2025Updated 6 months ago
Alternatives and similar repositories for MME-CoF
Users who are interested in MME-CoF are comparing it to the libraries listed below.
- 📝The official repository of "Rethinking Cross-Generator Image Forgery Detection through DINOv3"☆24Dec 2, 2025Updated 5 months ago
- 🔥 A continuously updated collection of papers, datasets, and benchmarks on post-training and alignment for video generation.☆142Apr 13, 2026Updated last month
- The first Interleaved framework for textual reasoning within the visual generation process☆160Mar 16, 2026Updated 2 months ago
- [NeurIPS 2025] Encoder-Decoder Diffusion Language Models for Efficient Training and Inference☆42Oct 29, 2025Updated 7 months ago
- [ICCV2025] The official code of "DreamRelation: Relation-Centric Video Customization"☆26Feb 4, 2026Updated 3 months ago
- ☆19Sep 24, 2024Updated last year
- [NeurIPS 2025] MINT-CoT: Enabling Interleaved Visual Tokens in Mathematical Chain-of-Thought Reasoning☆106Sep 19, 2025Updated 8 months ago
- CVPR2025 | TASTE-Rob: Advancing Video Generation of Task-Oriented Hand-Object Interaction for Generalizable Robotic Manipulation☆43Jan 29, 2026Updated 4 months ago
- CrossLMM: Decoupling Long Video Sequences from LMMs via Dual Cross-Attention Mechanisms☆25Dec 21, 2025Updated 5 months ago
- Official pytorch implementation of "DSPoint: Dual-scale Point Cloud Recognition with High-frequency Fusion"☆19Feb 4, 2025Updated last year
- [CVPR 2025] The First Investigation of CoT Reasoning (RL, TTS, Reflection) in Image Generation☆865Mar 19, 2026Updated 2 months ago
- Implementation for "DeltaPhi: Learning Physical Trajectory Residual for PDE Solving"☆13Jun 17, 2024Updated last year
- ☆38Dec 18, 2025Updated 5 months ago
- Just wanna see what type and how many GPUs/TPUs are used in CVPR 2025 oral papers. Fun vibe coding with LLMs.☆12Apr 24, 2025Updated last year
- 「AAAI 2024」 Referred by Multi-Modality: A Unified Temporal Transformers for Video Object Segmentation☆84Jun 13, 2025Updated 11 months ago
- ☆57Oct 17, 2021Updated 4 years ago
- Implementation of D4RT, Efficiently Reconstructing Dynamic Scenes, from Deepmind☆65May 20, 2026Updated last week
- This repository shares undergraduate course materials for the Electronic Information Engineering program at the University of Science and…☆68Mar 10, 2026Updated 2 months ago
- [NeurIPS 2025] Scaling Language-centric Omnimodal Representation Learning☆42Apr 13, 2026Updated last month
- [ACL2026] Uni-MMMU : A Massive Multi-discipline Multimodal Unified Benchmark☆25Apr 13, 2026Updated last month
- Official repository for "Visual Generation Unlocks Human-Like Reasoning through Multimodal World Models", https://arxiv.org/abs/2601.1983…☆96Mar 9, 2026Updated 2 months ago
- [ECCV 2024] Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems?☆181Apr 28, 2025Updated last year
- InsertAnywhere: Bridging 4D Scene Geometry and Diffusion Models for Realistic Video Object Insertion☆89Dec 27, 2025Updated 5 months ago
- Official implementation of "Repurposing Video Diffusion Transformers for Robust Point Tracking"☆44Dec 24, 2025Updated 5 months ago
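- The WMS inventory management system is a PHP-based web application designed to help small and medium-sized enterprises manage their inventory, supplier, and product information. It provides an intuitive user interface and powerful features that make inventory management simple and efficient.☆21Feb 18, 2025Updated last year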
- MME-CoT: Benchmarking Chain-of-Thought in LMMs for Reasoning Quality, Robustness, and Efficiency☆137Aug 5, 2025Updated 9 months ago
- [CVPR 2025 highlight] Generating 6DoF Object Manipulation Trajectories from Action Description in Egocentric Vision☆43Dec 2, 2025Updated 5 months ago
- ☆72Feb 1, 2026Updated 3 months ago
- This is a collection of recent papers on reasoning in video generation models.☆154May 13, 2026Updated 2 weeks ago
- Official implementation of "MV-TAP: Tracking Any Point in Multi-View Videos"☆43Mar 10, 2026Updated 2 months ago
- [ICML 2025] EasyRef: Omni-Generalized Group Image Reference for Diffusion Models via Multimodal LLM☆72Jul 16, 2025Updated 10 months ago
- ☆48Apr 14, 2025Updated last year
- ☆87Oct 10, 2025Updated 7 months ago
- open-sourced video dataset with dynamic scenes and camera movements annotation☆92Apr 24, 2025Updated last year
- A low-cost avatar system☆68Nov 13, 2025Updated 6 months ago
- [NIPS 2025] FreqPolicy: Frequency Autoregressive Visuomotor Policy with Continuous Tokens☆21Oct 12, 2025Updated 7 months ago
- The first open-domain closed-loop revisited benchmark for evaluating memory consistency and action control in world models.☆63Updated this week
- ☆11Nov 21, 2022Updated 3 years ago
- [NeurIPS 2025] T2I-R1: Reinforcing Image Generation with Collaborative Semantic-level and Token-level CoT☆432Sep 18, 2025Updated 8 months ago