NVIDIA / Cosmos
Cosmos is a world model development platform that consists of world foundation models, tokenizers, and a video processing pipeline to accelerate the development of Physical AI at robotics and AV labs. Cosmos is purpose-built for Physical AI. The Cosmos repository will enable end users to run the Cosmos models, run inference scripts and generate vide…
☆6,946 Updated last week
Alternatives and similar repositories for Cosmos:
Users who are interested in Cosmos are comparing it to the libraries listed below:
- A suite of image and video neural tokenizers ☆1,478 Updated this week
- A generative world for general-purpose robotics & embodied AI learning. ☆22,786 Updated this week
- The best OSS video generation models ☆2,718 Updated last week
- VILA is a family of state-of-the-art vision language models (VLMs) for diverse multimodal AI tasks across the edge, data center, and clou… ☆2,752 Updated last week
- Code of Pyramidal Flow Matching for Efficient Video Generative Modeling ☆2,701 Updated 3 weeks ago
- Official repository of "SAMURAI: Adapting Segment Anything Model for Zero-Shot Visual Tracking with Motion-Aware Memory" ☆6,333 Updated last week
- High-resolution models for human tasks. ☆4,763 Updated last month
- ☆19,214 Updated last week
- SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer ☆2,407 Updated this week
- Composable building blocks to build Llama Apps ☆6,036 Updated this week
- Run PyTorch LLMs locally on servers, desktop and mobile ☆3,462 Updated this week
- PyTorch code and models for V-JEPA self-supervised learning from video. ☆2,745 Updated 5 months ago
- Official repo for paper "Structured 3D Latents for Scalable and Versatile 3D Generation". ☆6,783 Updated 3 weeks ago
- DIAMOND (DIffusion As a Model Of eNvironment Dreams) is a reinforcement learning agent trained in a diffusion world model. NeurIPS 2024 S… ☆1,691 Updated last month
- Sky-T1: Train your own O1 preview model within $450 ☆1,795 Updated this week
- OpenVLA: An open-source vision-language-action model for robotic manipulation. ☆1,725 Updated last month
- The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained mode… ☆13,651 Updated 3 weeks ago
- MobileLLM Optimizing Sub-billion Parameter Language Models for On-Device Use Cases. In ICML 2024. ☆1,217 Updated last month
- Efficient Triton Kernels for LLM Training ☆4,183 Updated this week
- Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization. ☆9,326 Updated 6 months ago
- World's First Large-scale High-quality Robotic Manipulation Benchmark ☆1,189 Updated last week
- g1: Using Llama-3.1 70b on Groq to create o1-like reasoning chains ☆4,147 Updated last month
- HunyuanVideo: A Systematic Framework For Large Video Generation Model ☆7,425 Updated this week
- 🤗 LeRobot: Making AI for Robotics more accessible with end-to-end learning ☆8,228 Updated this week
- A PyTorch library for implementing flow matching algorithms, featuring continuous and discrete flow matching implementations. It includes… ☆1,774 Updated 2 weeks ago
- Depth Pro: Sharp Monocular Metric Depth in Less Than a Second. ☆4,010 Updated 3 months ago
- LLaVA-CoT, a visual language model capable of spontaneous, systematic reasoning ☆1,739 Updated last week
- Official PyTorch Implementation of "Scalable Diffusion Models with Transformers" ☆6,716 Updated 7 months ago
- Official repository for LTX-Video ☆2,562 Updated 2 weeks ago
- Repository for Meta Chameleon, a mixed-modal early-fusion foundation model from FAIR. ☆1,908 Updated 5 months ago