NVIDIA / Cosmos
New repo collection for NVIDIA Cosmos: https://github.com/nvidia-cosmos
☆8,028 Updated 3 weeks ago
Alternatives and similar repositories for Cosmos
Users who are interested in Cosmos are comparing it to the libraries listed below.
- Janus-Series: Unified Multimodal Understanding and Generation Models ☆17,411 Updated 5 months ago
- NVIDIA Isaac GR00T N1.5 is the world's first open foundation model for generalized humanoid robot reasoning and skills. ☆4,272 Updated last week
- SpatialLM: Training Large Language Models for Structured Indoor Modeling ☆3,424 Updated last week
- A suite of image and video neural tokenizers ☆1,637 Updated 4 months ago
- Qwen2.5-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud. ☆11,207 Updated last month
- PyTorch code and models for V-JEPA self-supervised learning from video. ☆3,102 Updated 4 months ago
- ☆3,805 Updated this week
- OpenVLA: An open-source vision-language-action model for robotic manipulation. ☆3,122 Updated 3 months ago
- Sky-T1: Train your own O1 preview model within $450 ☆3,286 Updated last month
- PyTorch code and models for VJEPA2 self-supervised learning from video. ☆1,650 Updated this week
- VILA is a family of state-of-the-art vision language models (VLMs) for diverse multimodal AI tasks across the edge, data center, and clou… ☆3,356 Updated last week
- HunyuanVideo: A Systematic Framework For Large Video Generation Model ☆10,489 Updated 3 weeks ago
- Depth Pro: Sharp Monocular Metric Depth in Less Than a Second. ☆4,593 Updated 2 months ago
- A generative world for general-purpose robotics & embodied AI learning. ☆25,357 Updated this week
- The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained mode… ☆15,974 Updated 6 months ago
- s1: Simple test-time scaling ☆6,468 Updated this week
- [CVPR 2025] Magma: A Foundation Model for Multimodal AI Agents ☆1,735 Updated last month
- This package contains the original 2012 AlexNet code. ☆2,658 Updated 3 months ago
- ☆3,381 Updated 3 months ago
- Minimal reproduction of DeepSeek R1-Zero ☆11,942 Updated 2 months ago
- High-resolution models for human tasks. ☆5,056 Updated 7 months ago
- This repository contains the official implementation of "FastVLM: Efficient Vision Encoding for Vision Language Models" - CVPR 2025 ☆4,267 Updated last month
- SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer ☆4,300 Updated 3 weeks ago
- Official PyTorch implementation for "Large Language Diffusion Models" ☆2,435 Updated 2 weeks ago
- An AI-powered research assistant that performs iterative, deep research on any topic by combining search engines, web scraping, and large… ☆16,837 Updated 3 weeks ago
- verl: Volcano Engine Reinforcement Learning for LLMs ☆10,204 Updated this week
- DeepSeek-VL: Towards Real-World Vision-Language Understanding ☆3,900 Updated last year
- Unified framework for robot learning built on NVIDIA Isaac Sim ☆4,028 Updated this week
- Everything about the SmolLM2 and SmolVLM family of models ☆2,606 Updated this week
- Cosmos-Transfer1 is a world-to-world transfer model designed to bridge the perceptual divide between simulated and real-world environment… ☆521 Updated this week