nvidia-cosmos / cosmos-reason1

Cosmos-Reason1 models understand physical common sense and generate appropriate embodied decisions in natural language through long chain-of-thought reasoning processes.

☆799

Alternatives and similar repositories for cosmos-reason1

Users interested in cosmos-reason1 are comparing it to the repositories listed below.

nvidia-cosmos / cosmos-predict1
Cosmos-Predict1 is a collection of general-purpose world foundation models for Physical AI that can be fine-tuned into customized world m…
☆378 Updated 3 months ago
nvidia-cosmos / cosmos-transfer1
Cosmos-Transfer1 is a world-to-world transfer model designed to bridge the perceptual divide between simulated and real-world environment…
☆729 Updated 3 weeks ago
lucidrains / pi-zero-pytorch
Implementation of π₀, the robotic foundation model architecture proposed by Physical Intelligence
☆522 Updated 3 months ago
nvidia-cosmos / cosmos-predict2
Cosmos-Predict2 is a collection of general-purpose world foundation models for Physical AI that can be fine-tuned into customized world m…
☆670 Updated 3 weeks ago
NVIDIA / GR00T-Dreams
NVIDIA GEAR Lab's initiative to solve the robotics data problem using world models
☆380 Updated 3 weeks ago
alibaba-damo-academy / WorldVLA
WorldVLA: Towards Autoregressive Action World Model
☆539 Updated last month
vision-x-nyu / thinking-in-space
Official repo and evaluation implementation of VSI-Bench
☆631 Updated 3 months ago
OpenDriveLab / UniVLA
[RSS 2025] Learning to Act Anywhere with Task-centric Latent Actions
☆839 Updated this week
TRI-ML / prismatic-vlms
A flexible and efficient codebase for training visually-conditioned language models (VLMs)
☆861 Updated last year
LatentActionPretraining / LAPA
[ICLR 2025] LAPA: Latent Action Pretraining from Videos
☆403 Updated 10 months ago
1x-technologies / 1xgpt
World modeling challenge for humanoid robots
☆522 Updated last year
myscience / open-genie
PyTorch implementation of "Genie: Generative Interactive Environments", Bruce et al. (2024).
☆226 Updated last year
embodiedreasoning / ERQA
Embodied Reasoning Question Answer (ERQA) Benchmark
☆241 Updated 8 months ago
unitreerobotics / unifolm-world-model-action
☆702 Updated last month
leofan90 / Awesome-World-Models
A comprehensive list of papers for the definition of World Models and using World Models for General Video Generation, Embodied AI, and A…
☆743 Updated this week
knightnemo / Awesome-World-Models
A Curated List of Awesome Works in World Modeling, Aiming to Serve as a One-stop Resource for Researchers, Practitioners, and Enthusiasts…
☆782 Updated last week
facebookresearch / jepa-intuitive-physics
This repo contains the code for the paper "Intuitive physics understanding emerges from self-supervised pretraining on natural videos"
☆196 Updated 9 months ago
FlagOpen / RoboBrain2.0
RoboBrain 2.0: Advanced version of RoboBrain. See Better. Think Harder. Do Smarter. 🎉🎉🎉
☆691 Updated this week
PRIME-RL / SimpleVLA-RL
SimpleVLA-RL: Scaling VLA Training via Reinforcement Learning
☆987 Updated last month
allenai / molmo
Code for the Molmo Vision-Language Model
☆808 Updated 11 months ago
remyxai / VQASynth
Compose multimodal datasets 🎹
☆507 Updated 3 months ago
simpler-env / SimplerEnv
Evaluating and reproducing real-world robot manipulation policies (e.g., RT-1, RT-1-X, Octo) in simulation under common setups (e.g., Goo…
☆842 Updated 7 months ago
UMass-Embodied-AGI / 3D-VLA
[ICML 2024] 3D-VLA: A 3D Vision-Language-Action Generative World Model
☆593 Updated last year
allenzren / open-pi-zero
Re-implementation of pi0 vision-language-action (VLA) model from Physical Intelligence
☆1,251 Updated 9 months ago
facebookresearch / vjepa2
PyTorch code and models for VJEPA2 self-supervised learning from video.
☆2,438 Updated 2 months ago
SpatialVLA / SpatialVLA
🔥 SpatialVLA: a spatial-enhanced vision-language-action model that is trained on 1.1 Million real robot episodes. Accepted at RSS 2025.
☆570 Updated 5 months ago
Robot-VLAs / RoboVLMs
☆411 Updated 9 months ago
thu-ml / RDT2
Official code of RDT 2
☆567 Updated last month
gaoyuezhou / dino_wm
☆312 Updated 7 months ago
Stanford-ILIAD / openvla-mini
OpenVLA: An open-source vision-language-action model for robotic manipulation.
☆293 Updated 8 months ago