NVIDIA / Cosmos
New repo collection for NVIDIA Cosmos: https://github.com/nvidia-cosmos
☆8,061 Updated 3 months ago
Alternatives and similar repositories for Cosmos
Users who are interested in Cosmos are comparing it to the libraries listed below.
- A suite of image and video neural tokenizers ☆1,671 Updated 7 months ago
- VILA is a family of state-of-the-art vision language models (VLMs) for diverse multimodal AI tasks across the edge, data center, and clou… ☆3,574 Updated 2 months ago
- A generative world for general-purpose robotics & embodied AI learning. ☆27,329 Updated this week
- Official repository of "SAMURAI: Adapting Segment Anything Model for Zero-Shot Visual Tracking with Motion-Aware Memory" ☆6,954 Updated 6 months ago
- SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer ☆4,536 Updated this week
- Minimal reproduction of DeepSeek R1-Zero ☆12,236 Updated 5 months ago
- [NeurIPS 2025] SpatialLM: Training Large Language Models for Structured Indoor Modeling ☆4,014 Updated last week
- NVIDIA Isaac GR00T N1.5 - A Foundation Model for Generalist Robots. ☆4,994 Updated last week
- PyTorch code and models for VJEPA2 self-supervised learning from video. ☆2,269 Updated last month
- High-resolution models for human tasks. ☆5,163 Updated 10 months ago
- MAGI-1: Autoregressive Video Generation at Scale ☆3,499 Updated 3 months ago
- ☆3,468 Updated 7 months ago
- [CVPR 2025] Magma: A Foundation Model for Multimodal AI Agents ☆1,817 Updated this week
- Reference PyTorch implementation and models for DINOv3 ☆7,517 Updated this week
- PyTorch code and models for V-JEPA self-supervised learning from video. ☆3,214 Updated 7 months ago
- OpenVLA: An open-source vision-language-action model for robotic manipulation. ☆3,990 Updated 6 months ago
- HunyuanVideo: A Systematic Framework For Large Video Generation Model ☆11,111 Updated last month
- [ICCV 2025] LLaVA-CoT, a visual language model capable of spontaneous, systematic reasoning ☆2,072 Updated last month
- The best OSS video generation models, created by Genmo ☆3,444 Updated last month
- 4M: Massively Multimodal Masked Modeling ☆1,765 Updated 4 months ago
- Fully open reproduction of DeepSeek-R1 ☆25,507 Updated last month
- Witness the aha moment of VLM with less than $3. ☆3,950 Updated 4 months ago
- Repository for Meta Chameleon, a mixed-modal early-fusion foundation model from FAIR. ☆2,054 Updated last year
- The simplest, fastest repository for training/finetuning small-sized VLMs. ☆4,085 Updated 3 weeks ago
- Official repository for our work on micro-budget training of large-scale diffusion models. ☆1,512 Updated 8 months ago
- Utilities intended for use with Llama models. ☆7,278 Updated 2 months ago
- Official PyTorch implementation for "Large Language Diffusion Models" ☆2,996 Updated last week
- DIAMOND (DIffusion As a Model Of eNvironment Dreams) is a reinforcement learning agent trained in a diffusion world model. NeurIPS 2024 S… ☆1,872 Updated 10 months ago
- [IROS 2025 Award Finalist] The Large-scale Manipulation Platform for Scalable and Intelligent Embodied Systems ☆2,474 Updated last week
- Janus-Series: Unified Multimodal Understanding and Generation Models ☆17,565 Updated 8 months ago