USC-GVL / PhysBench
[ICLR 2025] Official implementation and benchmark evaluation repository for "PhysBench: Benchmarking and Enhancing Vision-Language Models for Physical World Understanding"
☆44 Updated 3 weeks ago
Alternatives and similar repositories for PhysBench
Users who are interested in PhysBench are comparing it to the repositories listed below:
- Code for paper "Grounding Video Models to Actions through Goal Conditioned Exploration". ☆44 Updated 3 months ago
- MetaSpatial leverages reinforcement learning to enhance 3D spatial reasoning in vision-language models (VLMs), enabling more structured, … ☆81 Updated this week
- [CVPR 2024] Situational Awareness Matters in 3D Vision Language Reasoning ☆37 Updated 3 months ago
- ☆67 Updated 6 months ago
- ☆17 Updated 5 months ago
- ☆75 Updated 7 months ago
- Code for FLIP: Flow-Centric Generative Planning for General-Purpose Manipulation Tasks ☆51 Updated 3 months ago
- ☆122 Updated 2 months ago
- Evaluate Multimodal LLMs as Embodied Agents ☆39 Updated last month
- [CVPR 2025] 3D-GRAND: Towards Better Grounding and Less Hallucination for 3D-LLMs ☆36 Updated 9 months ago
- IKEA Manuals at Work: 4D Grounding of Assembly Instructions on Internet Videos ☆37 Updated 3 months ago
- A paper list that includes world models or generative video models for embodied agents. ☆19 Updated 2 months ago
- Unifying 2D and 3D Vision-Language Understanding ☆49 Updated last week
- ☆30 Updated this week
- Agent-to-Sim Learning Interactive Behavior from Casual Videos. ☆42 Updated 5 months ago
- ☆85 Updated 3 weeks ago
- ☆49 Updated this week
- ☆32 Updated last week
- The PyTorch implementation of paper: "AdaWorld: Learning Adaptable World Models with Latent Actions". ☆36 Updated this week
- [NeurIPS 2024] Official code repository for MSR3D paper ☆44 Updated 3 weeks ago
- [ICLR 2025] Official Implementation of M3: 3D-Spatial Multimodal Memory ☆118 Updated last week
- This repository is a collection of research papers on World Models. ☆37 Updated last year
- ☆59 Updated last week
- ☆21 Updated 2 months ago
- [arXiv 2024] The official repository of the paper "Unsupervised Discovery of Object-Centric Neural Fields" ☆17 Updated last month
- Program synthesis for 3D spatial reasoning ☆24 Updated last month
- ☆46 Updated 3 months ago
- Official implementation of "Self-Improving Video Generation" ☆62 Updated 3 weeks ago
- Code for Stable Control Representations ☆24 Updated 3 months ago
- EgoVid-5M: A Large-Scale Video-Action Dataset for Egocentric Video Generation ☆98 Updated 4 months ago