ZiyuGuo99 / MME-CoF
Are Video Models Ready as Zero-shot Reasoners?
☆84Nov 24, 2025Updated 2 months ago
Alternatives and similar repositories for MME-CoF
Users that are interested in MME-CoF are comparing it to the libraries listed below
- [NeurIPS 2025] Encoder-Decoder Diffusion Language Models for Efficient Training and Inference☆36Oct 29, 2025Updated 3 months ago
- aFun programming language☆12Feb 23, 2022Updated 3 years ago
- CVPR2025 | TASTE-Rob: Advancing Video Generation of Task-Oriented Hand-Object Interaction for Generalizable Robotic Manipulation☆33Jan 29, 2026Updated 2 weeks ago
- [ICCV2025] The official code of "DreamRelation: Relation-Centric Video Customization"☆27Feb 4, 2026Updated last week
- ☆28Dec 18, 2025Updated last month
- ☆34Oct 29, 2025Updated 3 months ago
- Just wanna see what type and how many GPUs/TPUs are used in CVPR 2025 oral papers. Fun vibe coding with LLMs.☆12Apr 24, 2025Updated 9 months ago
- Reinforcing Text-Rich Video Reasoning with Visual Rumination☆27Nov 24, 2025Updated 2 months ago
- ☆21Dec 10, 2025Updated 2 months ago
- [NeurIPS 2025 D&B (Spotlight🌟)] TIME: A Multi-level Benchmark for Temporal Reasoning of LLMs in Real-World Scenario☆29Oct 5, 2025Updated 4 months ago
- Official pytorch implementation of "ReDirector: Creating Any-Length Video Retakes with Rotary Camera Encoding"☆17Dec 17, 2025Updated last month
- Implementation for "DeltaPhi: Learning Physical Trajectory Residual for PDE Solving"☆13Jun 17, 2024Updated last year
- InsertAnywhere: Bridging 4D Scene Geometry and Diffusion Models for Realistic Video Object Insertion☆79Dec 27, 2025Updated last month
- Official implementation of "MV-TAP: Tracking Any Point in Multi-View Videos"☆38Dec 7, 2025Updated 2 months ago
- 4RC: 4D Reconstruction via Conditional Querying Anytime and Anywhere☆43Updated this week
- This is the official repository for the paper "FLUX-Reason-6M & PRISM-Bench: A Million-Scale Text-to-Image Reasoning Dataset and Comprehe…☆121Jan 29, 2026Updated 2 weeks ago
- ☆40Dec 6, 2025Updated 2 months ago
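- The WMS inventory management system is a PHP-based web application designed to help small and medium-sized businesses manage their inventory, supplier, and product information. It provides an intuitive user interface and powerful features that make inventory management simple and efficient.☆20Feb 18, 2025Updated 11 months ago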
- [ICCV 2025 Workshop Outstanding Paper Award] VChain: Chain-of-Visual-Thought for Reasoning in Video Generation☆116Oct 7, 2025Updated 4 months ago
- Official implementation of "Repurposing Video Diffusion Transformers for Robust Point Tracking"☆37Dec 24, 2025Updated last month
- ☆64Feb 1, 2026Updated 2 weeks ago
- Code for "StreamingTalker: Audio-driven 3D Facial Animation with Autoregressive Diffusion Model", AAAI2026 Oral☆42Jan 16, 2026Updated last month
- CrossLMM: Decoupling Long Video Sequences from LMMs via Dual Cross-Attention Mechanisms☆25Dec 21, 2025Updated last month
- ☆35Dec 16, 2025Updated 2 months ago
- [CVPR 2025] The First Investigation of CoT Reasoning (RL, TTS, Reflection) in Image Generation☆855May 23, 2025Updated 8 months ago
- [NeurIPS 2025] Source codes for the paper "MindJourney: Test-Time Scaling with World Models for Spatial Reasoning"☆128Nov 4, 2025Updated 3 months ago
- SpaceTimePilot: Generative Rendering of Dynamic Scenes Across Space and Time☆97Jan 1, 2026Updated last month
- ☆48Apr 14, 2025Updated 10 months ago
- 「AAAI 2024」 Referred by Multi-Modality: A Unified Temporal Transformers for Video Object Segmentation☆82Jun 13, 2025Updated 8 months ago
- ☆19Sep 24, 2024Updated last year
- VL-LN Bench: Towards Long-horizon Goal-oriented Navigation with Active Dialogs☆48Jan 5, 2026Updated last month
- Official pytorch implementation of "DSPoint: Dual-scale Point Cloud Recognition with High-frequency Fusion"☆19Feb 4, 2025Updated last year
- Official Implementation of "Aligned Novel View Image and Geometry Synthesis via Cross-modal Attention Instillation"☆45Jan 29, 2026Updated 2 weeks ago
- ☆53Dec 10, 2025Updated 2 months ago
- [CVPR 2025 highlight] Generating 6DoF Object Manipulation Trajectories from Action Description in Egocentric Vision☆33Dec 2, 2025Updated 2 months ago
- Official implementation of Log-linear Sparse Attention (LLSA).☆56Feb 2, 2026Updated last week
- [NeurIPS 2025] MINT-CoT: Enabling Interleaved Visual Tokens in Mathematical Chain-of-Thought Reasoning☆101Sep 19, 2025Updated 4 months ago
- EditReward: A Human-Aligned Reward Model for Instruction-Guided Image Editing [ICLR 2026]☆121Feb 6, 2026Updated last week
- ☆27Jul 6, 2022Updated 3 years ago