NVIDIA / Cosmos
Cosmos is a world model development platform that consists of world foundation models, tokenizers, and a video processing pipeline to accelerate the development of Physical AI at robotics and AV labs. Cosmos is purpose-built for Physical AI. The Cosmos repository will enable end users to run the Cosmos models, run inference scripts and generate vide…
☆7,540 Updated last week
Alternatives and similar repositories for Cosmos:
Users who are interested in Cosmos are comparing it to the libraries listed below:
- A suite of image and video neural tokenizers ☆1,558 Updated last week
- A generative world for general-purpose robotics & embodied AI learning. ☆23,918 Updated this week
- Official repository of "SAMURAI: Adapting Segment Anything Model for Zero-Shot Visual Tracking with Motion-Aware Memory" ☆6,512 Updated this week
- OpenVLA: An open-source vision-language-action model for robotic manipulation. ☆1,951 Updated 2 months ago
- VILA is a family of state-of-the-art vision language models (VLMs) for diverse multimodal AI tasks across the edge, data center, and clou… ☆2,916 Updated last week
- ☆2,197 Updated this week
- DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding ☆3,827 Updated this week
- Qwen2.5-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud. ☆7,745 Updated this week
- 🤗 smolagents: a barebones library for agents. Agents write python code to call tools and orchestrate other agents. ☆11,162 Updated this week
- Sky-T1: Train your own O1 preview model within $450 ☆2,641 Updated this week
- Janus-Series: Unified Multimodal Understanding and Generation Models