Official code for MotionBench (CVPR 2025)
☆70 · Mar 3, 2025 · Updated last year
Alternatives and similar repositories for MotionBench
Users who are interested in MotionBench are comparing it to the repositories listed below.
- [ICLR 2026] MotionSight's official code implementation. ☆47 · Feb 13, 2026 · Updated last month
- ☆11 · Aug 4, 2024 · Updated last year
- ☆20 · Oct 15, 2025 · Updated 5 months ago
- ☆14 · Sep 11, 2025 · Updated 6 months ago
- [CVPR 2025] PVC: Progressive Visual Token Compression for Unified Image and Video Processing in Large Vision-Language Models ☆51 · Jun 12, 2025 · Updated 9 months ago
- ☆14 · Jun 2, 2025 · Updated 9 months ago
- Fine-Tuning Code Language Models for Text-Driven Sequential CAD Design ☆25 · Jan 9, 2026 · Updated 2 months ago
- (ICCV2025) Official repository of paper "ViSpeak: Visual Instruction Feedback in Streaming Videos" ☆47 · Jul 1, 2025 · Updated 8 months ago
- KMM: Key Frame Mask Mamba for Extended Motion Generation ☆19 · Sep 22, 2025 · Updated 6 months ago
- ☆48 · Nov 1, 2024 · Updated last year
- [EMNLP 2024] Preserving Multi-Modal Capabilities of Pre-trained VLMs for Improving Vision-Linguistic Compositionality ☆21 · Oct 8, 2024 · Updated last year
- ☆20 · Nov 21, 2025 · Updated 4 months ago
- TemporalBench: Benchmarking Fine-grained Temporal Understanding for Multimodal Video Models ☆38 · Nov 10, 2024 · Updated last year
- Soft-QMIX: Integrating Maximum Entropy For Monotonic Value Function Factorization ☆15 · Jul 3, 2024 · Updated last year
- [CVPR 2025] Official implementation of the paper "SimMotionEdit: Text-Based Human Motion Editing with Motion Similarity Prediction" ☆47 · Dec 11, 2025 · Updated 3 months ago
- Code for "CLIP Behaves like a Bag-of-Words Model Cross-modally but not Uni-modally" ☆25 · Feb 27, 2026 · Updated 3 weeks ago
- Retargeting of the 100STYLE dataset onto a common skeleton ☆36 · Sep 16, 2025 · Updated 6 months ago
- [CVPR 2025 Oral] VideoEspresso: A Large-Scale Chain-of-Thought Dataset for Fine-Grained Video Reasoning via Core Frame Selection ☆137 · Jul 28, 2025 · Updated 7 months ago
- Official Implementation of MDK12-Bench: A Multi-Discipline Benchmark for Evaluating Reasoning in Multimodal Large Language Models ☆12 · Nov 1, 2025 · Updated 4 months ago
- [ICME 2024 Oral] DARA: Domain- and Relation-aware Adapters Make Parameter-efficient Tuning for Visual Grounding ☆23 · Feb 26, 2025 · Updated last year
- ☆37 · Nov 8, 2024 · Updated last year
- From Flatland to Space (SPAR). Accepted to NeurIPS 2025 Datasets & Benchmarks. A large-scale dataset & benchmark for 3D spatial perceptio… ☆81 · Jan 5, 2026 · Updated 2 months ago
- This is the official repository of CVPR 2025 Paper: Dynamic Motion Blending for Versatile Motion Editing. ☆48 · Mar 29, 2025 · Updated 11 months ago
- A Decade of Action Quality Assessment: Largest Systematic Survey of Trends, Challenges, and Future Directions ☆15 · Jan 22, 2026 · Updated 2 months ago
- Fast, memory-efficient attention column reduction (e.g., sum, mean, max) ☆42 · Feb 10, 2026 · Updated last month
- Code for paper: Unified Text-to-Image Generation and Retrieval ☆16 · Jul 6, 2024 · Updated last year
- Scaling Motion Generation Model with Million-Level Human Motions (ICML 2025) ☆69 · May 14, 2025 · Updated 10 months ago
- 🕵 Code for our EMNLP 2025 Main paper: "FlashAdventure: A Benchmark for GUI Agents Solving Full Story Arcs in Diverse Adventure Games" ☆25 · Dec 14, 2025 · Updated 3 months ago
- Official implementation of Next Block Prediction: Video Generation via Semi-Autoregressive Modeling ☆42 · Feb 12, 2025 · Updated last year
- Codec-Aligned Sparsity as a Foundational Principle for Multimodal Intelligence ☆299 · Mar 2, 2026 · Updated 3 weeks ago
- Official code for the paper "Understanding Co-speech Gestures in-the-wild" ☆20 · Oct 31, 2025 · Updated 4 months ago
- (NeurIPS 2024 Spotlight) TOPA: Extend Large Language Models for Video Understanding via Text-Only Pre-Alignment ☆29 · Sep 27, 2024 · Updated last year
- ☆25 · Nov 17, 2025 · Updated 4 months ago
- Retargeting of the ZeroEGGs dataset onto a common character ☆40 · Sep 16, 2025 · Updated 6 months ago
- Transactions on Multimedia (TMM25) ☆19 · Apr 8, 2025 · Updated 11 months ago
- [NeurIPS 2024] Official PyTorch implementation of "Improving Compositional Reasoning of CLIP via Synthetic Vision-Language Negatives" ☆46 · Dec 1, 2024 · Updated last year
- Official InfiniBench: A Benchmark for Large Multi-Modal Models in Long-Form Movies and TV Shows ☆19 · Nov 4, 2025 · Updated 4 months ago
- Official Implementation of AQ-GT: a Temporally Aligned and Quantized GRU-Transformer for Co-Speech Gesture Synthesis with the extension (… ☆21 · Apr 19, 2024 · Updated last year
- [ICLR 2025] Ready-to-React: Online Reaction Policy for Two-Character Interaction Generation ☆45 · Mar 13, 2025 · Updated last year