eloialonso / diamond
DIAMOND (DIffusion As a Model Of eNvironment Dreams) is a reinforcement learning agent trained in a diffusion world model. NeurIPS 2024 Spotlight.
☆1,804 Updated 5 months ago
Alternatives and similar repositories for diamond:
Users who are interested in diamond are comparing it to the repositories listed below.
- Inference script for Oasis 500M ☆1,820 Updated 6 months ago
- A suite of image and video neural tokenizers ☆1,621 Updated 2 months ago
- The best OSS video generation models ☆3,138 Updated 4 months ago
- SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer ☆4,098 Updated this week
- FastVideo is a lightweight framework for accelerating large video diffusion models. ☆1,366 Updated this week
- Mastering Diverse Domains through World Models ☆1,813 Updated 3 weeks ago
- Memory-optimized training library for diffusion models ☆1,120 Updated this week
- Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation ☆1,735 Updated 8 months ago
- DimensionX: Create Any 3D and 4D Scenes from a Single Image with Controllable Video Diffusion ☆1,233 Updated 5 months ago
- Stable Virtual Camera: Generative View Synthesis with Diffusion Models ☆1,232 Updated last week
- ☆3,189 Updated 2 weeks ago
- Codebase for Aria - an Open Multimodal Native MoE ☆1,033 Updated 3 months ago
- SPAR3D: Stable Point-Aware Reconstruction of 3D Objects from Single Images ☆778 Updated 2 months ago
- New repo collection for NVIDIA Cosmos: https://github.com/nvidia-cosmos ☆7,945 Updated last week
- [CVPR 2025] Magma: A Foundation Model for Multimodal AI Agents ☆1,623 Updated this week
- Official repository for our work on micro-budget training of large-scale diffusion models. ☆1,397 Updated 3 months ago
- The first behavioral foundation model to control a virtual physics-based humanoid agent for a wide range of whole-body tasks. ☆587 Updated 2 months ago
- code for "Diffusion Forcing: Next-token Prediction Meets Full-Sequence Diffusion" ☆843 Updated last month
- ☆1,153 Updated 4 months ago
- PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis ☆3,070 Updated 6 months ago
- Automating the Search for Artificial Life with Foundation Models! ☆409 Updated 3 months ago
- NanoGPT (124M) in 3 minutes ☆2,520 Updated last week
- VILA is a family of state-of-the-art vision language models (VLMs) for diverse multimodal AI tasks across the edge, data center, and clou… ☆3,201 Updated last week
- Official PyTorch implementation of One-Minute Video Generation with Test-Time Training ☆1,483 Updated 3 weeks ago
- VideoSys: An easy and efficient system for video generation ☆1,960 Updated 2 months ago
- Lumina-T2X is a unified framework for Text to Any Modality Generation ☆2,183 Updated 2 months ago
- Motion-Controllable Video Diffusion via Warped Noise ☆879 Updated last month
- Simple and readable code for training and sampling from diffusion models ☆487 Updated 4 months ago
- CVPR2025 ☆849 Updated last week
- ☆1,001 Updated 6 months ago