nvidia-cosmos / cosmos-rl
Cosmos-RL is a flexible and scalable Reinforcement Learning framework specialized for Physical AI applications.
☆150 · Updated last week
Alternatives and similar repositories for cosmos-rl
Users interested in cosmos-rl are comparing it to the libraries listed below.
- ☆25 · Updated last month
- Long-RL: Scaling RL to Long Sequences (NeurIPS 2025) ☆615 · Updated this week
- Multi-SpatialMLLM: Multi-Frame Spatial Understanding with Multi-Modal Large Language Models ☆152 · Updated 4 months ago
- Official repository for "RLVR-World: Training World Models with Reinforcement Learning" (NeurIPS 2025), https://arxiv.org/abs/2505.13934 ☆88 · Updated last week
- A Video Tokenizer Evaluation Dataset ☆133 · Updated 8 months ago
- A curated list of recent papers on efficient video attention for video diffusion models, including sparsification, quantization, and cach… ☆38 · Updated 2 weeks ago
- Code release for the paper "Test-Time Training Done Right" ☆283 · Updated 2 weeks ago
- Official implementation of BitVLA: 1-bit Vision-Language-Action Models for Robotics Manipulation ☆78 · Updated 2 months ago
- Cosmos-Curate is a powerful video curation system that processes, analyzes, and organizes video content using advanced AI models and dist… ☆73 · Updated last week
- Cosmos-Predict1 is a collection of general-purpose world foundation models for Physical AI that can be fine-tuned into customized world m… ☆355 · Updated last month
- RLinf is a flexible and scalable open-source infrastructure designed for post-training foundation models (LLMs, VLMs, VLAs) via reinforce… ☆428 · Updated this week
- ☆142 · Updated 8 months ago
- [ICLR 2025] Official implementation and benchmark evaluation repository of <PhysBench: Benchmarking and Enhancing Vision-Language Models … ☆70 · Updated 3 months ago
- [ICLR 2025] VILA-U: a Unified Foundation Model Integrating Visual Understanding and Generation ☆386 · Updated 5 months ago
- ☆245 · Updated 3 months ago
- A Survey of Efficient Attention Methods: Hardware-efficient, Sparse, Compact, and Linear Attention ☆181 · Updated last month
- Virtual Community: An Open World for Humans, Robots, and Society ☆172 · Updated this week
- FORA introduces a simple yet effective caching mechanism in the Diffusion Transformer architecture for faster inference sampling. ☆48 · Updated last year
- mllm-npu: training multimodal large language models on Ascend NPUs ☆92 · Updated last year
- Visual Embodied Brain: Let Multimodal Large Language Models See, Think, and Control in Spaces ☆80 · Updated 3 months ago
- ACTIVE-O3: Empowering Multimodal Large Language Models with Active Perception via GRPO ☆72 · Updated 3 months ago
- To pioneer training long-context multi-modal transformer models ☆58 · Updated last month
- Cosmos-Predict2 is a collection of general-purpose world foundation models for Physical AI that can be fine-tuned into customized world m… ☆592 · Updated 3 weeks ago
- Tiny-FSDP, a minimalistic re-implementation of PyTorch FSDP ☆81 · Updated last month
- Memory Efficient Training Framework for Large Video Generation Models ☆25 · Updated last year
- Cosmos-Reason1 models understand physical common sense and generate appropriate embodied decisions in natural language through long c… ☆707 · Updated last week
- Code for Draft Attention ☆90 · Updated 4 months ago
- A sparse attention kernel supporting mixed sparse patterns ☆303 · Updated 7 months ago
- Megatron's multi-modal data loader ☆243 · Updated 3 weeks ago
- ☆175 · Updated 8 months ago