eloialonso / diamond
DIAMOND (DIffusion As a Model Of eNvironment Dreams) is a reinforcement learning agent trained in a diffusion world model. NeurIPS 2024 Spotlight.
☆1,731 Updated 2 months ago
Alternatives and similar repositories for diamond:
Users who are interested in diamond are comparing it to the libraries listed below.
- Inference script for Oasis 500M ☆1,741 Updated 3 months ago
- Official repository for our work on micro-budget training of large-scale diffusion models. ☆1,246 Updated last month
- Unifying 3D Mesh Generation with Language Models ☆921 Updated 2 months ago
- Code of Pyramidal Flow Matching for Efficient Video Generative Modeling ☆2,780 Updated last month
- SF3D: Stable Fast 3D Mesh Reconstruction with UV-unwrapping and Illumination Disentanglement ☆1,367 Updated 3 weeks ago
- The first behavioral foundation model to control a virtual physics-based humanoid agent for a wide range of whole-body tasks. ☆536 Updated this week
- The best OSS video generation models ☆2,915 Updated last month
- A blender addon for generating meshes with AI ☆467 Updated last month
- FastVideo is a lightweight framework for accelerating large video diffusion models. ☆1,095 Updated this week
- A suite of image and video neural tokenizers ☆1,558 Updated last week
- SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer ☆3,386 Updated last week
- DimensionX: Create Any 3D and 4D Scenes from a Single Image with Controllable Video Diffusion ☆1,192 Updated 2 months ago
- SPAR3D: Stable Point-Aware Reconstruction of 3D Objects from Single Images ☆699 Updated last week
- Official repository for LTX-Video ☆2,857 Updated this week
- Code release for https://kovenyu.com/WonderWorld/ ☆437 Updated last month
- ☆242 Updated this week
- Unified framework for robot learning built on NVIDIA Isaac Sim ☆2,841 Updated this week
- Official Repository for "DrEureka: Language Model Guided Sim-To-Real Transfer" (RSS 2024) ☆841 Updated 5 months ago
- VILA is a family of state-of-the-art vision language models (VLMs) for diverse multimodal AI tasks across the edge, data center, and clou… ☆2,916 Updated last week
- A PyTorch library for implementing flow matching algorithms, featuring continuous and discrete flow matching implementations. It includes… ☆1,986 Updated last month
- Text-to-Music Generation with Rectified Flow Transformers ☆1,667 Updated 2 months ago
- Lumina-T2X is a unified framework for Text to Any Modality Generation ☆2,156 Updated this week
- Repository for Meta Chameleon, a mixed-modal early-fusion foundation model from FAIR. ☆1,930 Updated 6 months ago
- Motion-Controllable Video Diffusion via Warped Noise ☆745 Updated this week
- A general fine-tuning kit geared toward diffusion models. ☆2,092 Updated last week
- Depth Pro: Sharp Monocular Metric Depth in Less Than a Second. ☆4,122 Updated 4 months ago
- LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve spee… ☆2,809 Updated 3 months ago
- ☆778 Updated 3 weeks ago