facebookresearch / IntPhys2
This is the code repository for IntPhys 2, a video benchmark designed to evaluate the intuitive physics understanding of deep learning models.
☆81 · Updated 2 weeks ago
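IntPhys-style benchmarks typically score a model by whether it rates a physically possible video as more plausible than a matched impossible one. Below is a minimal, hypothetical sketch of that relative-classification metric; the `plausibility` scorer and file names are illustrative assumptions, not the repository's actual API.

```python
# Hypothetical sketch of IntPhys-style relative classification:
# a model passes a pair if it assigns a higher plausibility score
# to the possible video than to the matched impossible one.
# `plausibility` stands in for any model-derived score (e.g. negative
# prediction error); it is NOT the actual IntPhys2 interface.
from typing import Callable, Sequence, Tuple

def relative_accuracy(
    pairs: Sequence[Tuple[str, str]],      # (possible_path, impossible_path)
    plausibility: Callable[[str], float],  # higher = more physically plausible
) -> float:
    """Fraction of pairs where the possible video scores higher."""
    correct = sum(
        plausibility(possible) > plausibility(impossible)
        for possible, impossible in pairs
    )
    return correct / len(pairs)

if __name__ == "__main__":
    # Toy stand-in: pretend per-file scores were precomputed by a model.
    fake_scores = {"pair0_possible.mp4": 0.9, "pair0_impossible.mp4": 0.2}
    pairs = [("pair0_possible.mp4", "pair0_impossible.mp4")]
    print(relative_accuracy(pairs, fake_scores.__getitem__))
```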
Alternatives and similar repositories for IntPhys2
Users interested in IntPhys2 are comparing it to the repositories listed below.
- This repo contains the code for the paper "Intuitive physics understanding emerges from self-supervised pretraining on natural videos" ☆193 · Updated 8 months ago
- PyTorch implementation of "Genie: Generative Interactive Environments", Bruce et al. (2024). ☆219 · Updated last year
- [ICML'25] The PyTorch implementation of the paper "AdaWorld: Learning Adaptable World Models with Latent Actions". ☆170 · Updated 4 months ago
- ☆297 · Updated 7 months ago
- Implementation of Danijar's latest iteration of his Dreamer line of work ☆99 · Updated this week
- Cosmos-Predict2.5, the latest version of the Cosmos World Foundation Models (WFMs) family, specialized for simulating and predicting the … ☆318 · Updated this week
- Cosmos-Predict2 is a collection of general-purpose world foundation models for Physical AI that can be fine-tuned into customized world m… ☆655 · Updated last week
- Nvidia GEAR Lab's initiative to solve the robotics data problem using world models ☆358 · Updated 2 weeks ago
- Theia: Distilling Diverse Vision Foundation Models for Robot Learning ☆257 · Updated this week
- Official Repository for MolmoAct ☆244 · Updated 2 weeks ago
- Clarity: A Minimalist Website Template for AI Research ☆162 · Updated 9 months ago
- Cosmos-Predict1 is a collection of general-purpose world foundation models for Physical AI that can be fine-tuned into customized world m… ☆374 · Updated 2 months ago
- Official repository for "iVideoGPT: Interactive VideoGPTs are Scalable World Models" (NeurIPS 2024), https://arxiv.org/abs/2405.15223 ☆156 · Updated last month
- ☆77 · Updated 5 months ago
- NORA: A Small Open-Sourced Generalist Vision Language Action Model for Embodied Tasks ☆183 · Updated 3 months ago
- [ICLR'25] LLaRA: Supercharging Robot Learning Data for Vision-Language Policy ☆225 · Updated 7 months ago
- Visual Embodied Brain: Let Multimodal Large Language Models See, Think, and Control in Spaces ☆85 · Updated 5 months ago
- Official implementation for BitVLA: 1-bit Vision-Language-Action Models for Robotics Manipulation ☆88 · Updated 3 months ago
- Cosmos-RL is a flexible and scalable Reinforcement Learning framework specialized for Physical AI applications. ☆207 · Updated this week
- [ICLR 2025] LAPA: Latent Action Pretraining from Videos ☆398 · Updated 9 months ago
- Embodied Reasoning Question Answer (ERQA) Benchmark ☆238 · Updated 7 months ago
- Generative World Explorer ☆159 · Updated 4 months ago
- [ICLR 2025] Official implementation and benchmark evaluation repository of <PhysBench: Benchmarking and Enhancing Vision-Language Models … ☆75 · Updated 5 months ago
- OpenVLA: An open-source vision-language-action model for robotic manipulation. ☆280 · Updated 7 months ago
- [ICML 2024] A Touch, Vision, and Language Dataset for Multimodal Alignment ☆85 · Updated 5 months ago
- A Video Tokenizer Evaluation Dataset ☆137 · Updated 9 months ago
- ☆150 · Updated 10 months ago
- Implementation of the Large Behavioral Model architecture for dexterous manipulation from Toyota Research Institute ☆66 · Updated last month
- ☆122 · Updated 8 months ago
- A Curated List of Awesome Works in World Modeling, Aiming to Serve as a One-stop Resource for Researchers, Practitioners, and Enthusiasts… ☆702 · Updated this week