eloialonso / diamond
DIAMOND (DIffusion As a Model Of eNvironment Dreams) is a reinforcement learning agent trained in a diffusion world model. NeurIPS 2024 Spotlight.
☆1,905 Updated 11 months ago
Alternatives and similar repositories for diamond
Users who are interested in diamond are comparing it to the libraries listed below.
- Inference script for Oasis 500M ☆1,982 Updated last year
- A minimal implementation of DeepMind's Genie world model ☆1,042 Updated last week
- A suite of image and video neural tokenizers ☆1,686 Updated 9 months ago
- PyTorch code and models for VJEPA2 self-supervised learning from video. ☆2,477 Updated 3 months ago
- A unified inference and post-training framework for accelerated video generation. ☆2,659 Updated this week
- The best OSS video generation models, created by Genmo ☆3,515 Updated 2 weeks ago
- Official repository for our work on micro-budget training of large-scale diffusion models. ☆1,527 Updated 10 months ago
- SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer ☆4,761 Updated this week
- MineWorld: A Real-time interactive world model on Minecraft ☆415 Updated 3 months ago
- Generating Immersive, Explorable, and Interactive 3D Worlds from Words or Pixels with Hunyuan3D World Model ☆2,461 Updated last month
- code for "Diffusion Forcing: Next-token Prediction Meets Full-Sequence Diffusion" ☆1,084 Updated 3 weeks ago
- Continuous Thought Machines, because thought takes time and reasoning is a process. ☆1,530 Updated last month
- The official implementation of CVPR'25 Oral paper "Go-with-the-Flow: Motion-Controllable Video Diffusion Models Using Real-Time Warped No… ☆1,045 Updated last month
- A general fine-tuning kit geared toward image/video/audio diffusion models. ☆2,607 Updated last week
- Scalable and memory-optimized training of diffusion models ☆1,306 Updated 5 months ago
- ☆1,151 Updated last year
- Automating the Search for Artificial Life with Foundation Models! ☆444 Updated last month
- Stable Virtual Camera: Generative View Synthesis with Diffusion Models ☆1,501 Updated 5 months ago
- [ICCV'25]DimensionX: Create Any 3D and 4D Scenes from a Single Image with Controllable Video Diffusion ☆1,312 Updated last month
- [ICLR 2025] Pyramidal Flow Matching for Efficient Video Generative Modeling ☆3,124 Updated 11 months ago
- Minimal implementation of scalable rectified flow transformers, based on SD3's approach ☆620 Updated last year
- The first behavioral foundation model to control a virtual physics-based humanoid agent for a wide range of whole-body tasks. ☆689 Updated 5 months ago
- ☆271 Updated 2 months ago
- ☆2,226 Updated last year
- [NeurIPS 2025] MMaDA - Open-Sourced Multimodal Large Diffusion Language Models ☆1,504 Updated 2 weeks ago
- Voyager is an interactive RGBD video generation model conditioned on camera input, and supports real-time 3D reconstruction. ☆1,380 Updated last month
- Unifying 3D Mesh Generation with Language Models ☆1,125 Updated 8 months ago
- ☆316 Updated 6 months ago
- Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation ☆1,900 Updated last year
- 4M: Massively Multimodal Masked Modeling ☆1,773 Updated 5 months ago