NVIDIA / Cosmos
Cosmos is a world model development platform that consists of world foundation models, tokenizers, and a video processing pipeline to accelerate the development of Physical AI at Robotics & AV labs. Cosmos is purpose-built for physical AI. The Cosmos repository will enable end users to run the Cosmos models, run inference scripts and generate vide…
☆7,688 Updated last week
Alternatives and similar repositories for Cosmos:
Users who are interested in Cosmos are comparing it to the libraries listed below:
- High-resolution models for human tasks. ☆4,887 Updated 4 months ago
- A suite of image and video neural tokenizers ☆1,575 Updated last month
- A generative world for general-purpose robotics & embodied AI learning. ☆24,356 Updated this week
- VILA is a family of state-of-the-art vision language models (VLMs) for diverse multimodal AI tasks across the edge, data center, and clou… ☆3,006 Updated last week
- The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained mode… ☆14,522 Updated 2 months ago
- Qwen2.5-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud. ☆8,807 Updated last week
- HunyuanVideo: A Systematic Framework For Large Video Generation Model ☆9,231 Updated this week
- SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer ☆3,696 Updated this week
- SGLang is a fast serving framework for large language models and vision language models. ☆11,970 Updated this week
- The official repo of MiniMax-Text-01 and MiniMax-VL-01, large-language-model & vision-language-model based on Linear Attention ☆2,348 Updated this week
- DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding ☆4,549 Updated 2 weeks ago
- Official repository of "SAMURAI: Adapting Segment Anything Model for Zero-Shot Visual Tracking with Motion-Aware Memory" ☆6,606 Updated last month
- verl: Volcano Engine Reinforcement Learning for LLMs ☆4,847 Updated this week
- 🤗 LeRobot: Making AI for Robotics more accessible with end-to-end learning ☆10,373 Updated this week
- Sky-T1: Train your own O1 preview model within $450 ☆3,123 Updated this week
- OpenVLA: An open-source vision-language-action model for robotic manipulation. ☆2,227 Updated 2 weeks ago
- The best OSS video generation models ☆3,013 Updated 2 months ago