facebookresearch / IntPhys2
This is the code repository for IntPhys 2, a video benchmark designed to evaluate the intuitive physics understanding of deep learning models.
☆68Updated 2 months ago
Alternatives and similar repositories for IntPhys2
Users who are interested in IntPhys2 are comparing it to the repositories listed below.
- This repo contains the code for the paper "Intuitive physics understanding emerges from self-supervised pretraining on natural videos"☆176Updated 6 months ago
- PyTorch implementation of "Genie: Generative Interactive Environments", Bruce et al. (2024).☆193Updated last year
- Clarity: A Minimalist Website Template for AI Research☆132Updated 7 months ago
- Benchmarking physical understanding in generative video models☆192Updated 3 months ago
- [ICML'25] The PyTorch implementation of paper: "AdaWorld: Learning Adaptable World Models with Latent Actions".☆147Updated 2 months ago
- Theia: Distilling Diverse Vision Foundation Models for Robot Learning☆248Updated 4 months ago
- Cosmos-Predict2 is a collection of general-purpose world foundation models for Physical AI that can be fine-tuned into customized world m…☆527Updated this week
- [ICLR'25] LLaRA: Supercharging Robot Learning Data for Vision-Language Policy☆223Updated 5 months ago
- Cosmos-Predict1 is a collection of general-purpose world foundation models for Physical AI that can be fine-tuned into customized world m…☆331Updated last week
- Nvidia GEAR Lab's initiative to solve the robotics data problem using world models☆289Updated last week
- ☆238Updated 5 months ago
- ☆121Updated 6 months ago
- A Video Tokenizer Evaluation Dataset☆130Updated 7 months ago
- Generative World Explorer☆154Updated 2 months ago
- Official Repository for MolmoAct☆109Updated last week
- Official repository for "iVideoGPT: Interactive VideoGPTs are Scalable World Models" (NeurIPS 2024), https://arxiv.org/abs/2405.15223 ☆141Updated 3 months ago
- Scaling Vision Pre-Training to 4K Resolution☆200Updated this week
- [ICML 2024] A Touch, Vision, and Language Dataset for Multimodal Alignment☆83Updated 3 months ago
- Official implementation of the paper "EgoPet: Egomotion and Interaction Data from an Animal's Perspective".☆27Updated last year
- ☆78Updated 3 months ago
- Official Implementation for our NeurIPS 2024 paper, "Don't Look Twice: Run-Length Tokenization for Faster Video Transformers".☆222Updated 5 months ago
- Cosmos-Reason1 models understand the physical common sense and generate appropriate embodied decisions in natural language through long c…☆664Updated last week
- ElasticTok: Adaptive Tokenization for Image and Video☆75Updated 9 months ago
- Implementation of the Large Behavioral Model architecture for dexterous manipulation from Toyota Research Institute☆53Updated this week
- ☆38Updated 6 months ago
- ☆52Updated last month
- EgoVid-5M: A Large-Scale Video-Action Dataset for Egocentric Video Generation☆114Updated last month
- ☆136Updated 7 months ago
- Source code for the paper "MindJourney: Test-Time Scaling with World Models for Spatial Reasoning"☆79Updated last month
- [ICLR 2025] Official implementation and benchmark evaluation repository of <PhysBench: Benchmarking and Enhancing Vision-Language Models …☆68Updated 3 months ago