Official implementation of "Self-Improving Video Generation"
☆77Apr 25, 2025Updated last year
Alternatives and similar repositories for VideoAgent
Users that are interested in VideoAgent are comparing it to the libraries listed below.
- [ICLR 2025 Spotlight] Grounding Video Models to Actions through Goal Conditioned Exploration☆62May 4, 2025Updated 11 months ago
- This is the official implementation of SG-I2V: Self-Guided Trajectory Control in Image-to-Video Generation.☆117Nov 26, 2024Updated last year
- Training-free Guidance in Text-to-Video Generation via Multimodal Planning and Structured Noise Initialization☆26Apr 14, 2025Updated last year
- Official repository for our paper on "Action Inference by Maximising Evidence: Zero-Shot Imitation from Observation with World Models"☆13Dec 4, 2023Updated 2 years ago
- Official repository of Learning to Act from Actionless Videos through Dense Correspondences.☆254Apr 25, 2024Updated 2 years ago
- MCP prompt tool applying Chain-of-Draft (CoD) reasoning - BYOLLM☆19Sep 8, 2025Updated 7 months ago
- Benchmarking physical understanding in generative video models☆286Apr 16, 2026Updated 2 weeks ago
- Video Diffusion Alignment via Reward Gradients. We improve a variety of video diffusion models such as VideoCrafter, OpenSora, ModelScope…☆314Mar 12, 2025Updated last year
- DiT for VAE (and Video Generation)☆35Sep 2, 2024Updated last year
- Subtask-Aware Visual Reward Learning from Segmented Demonstrations (ICLR 2025 accepted)☆19Apr 11, 2025Updated last year
- We introduce OpenStory++, a large-scale open-domain dataset focusing on enabling MLLMs to perform storytelling generation tasks.☆18Aug 30, 2024Updated last year
- ☆81May 23, 2025Updated 11 months ago
- Code for the paper Bootstrap Your Own Skills: Learning to Solve New Tasks with Large Language Model Guidance, accepted to CoRL 2023 as an…☆35Jul 15, 2025Updated 9 months ago
- A custom node extension for ComfyUI that integrates Google's Veo 2 text-to-video generation capabilities.☆32Apr 12, 2025Updated last year
- [ICLR 2025] LAPA: Latent Action Pretraining from Videos☆513Jan 22, 2025Updated last year
- Code for "Evaluating Robot Policies in a World Model".☆90Nov 6, 2025Updated 5 months ago
- Official repository for "RLVR-World: Training World Models with Reinforcement Learning" (NeurIPS 2025), https://arxiv.org/abs/2505.13934☆245Oct 28, 2025Updated 6 months ago
- [ICCV 2025] "Fine-grained Spatiotemporal Grounding on Egocentric Videos"☆23Nov 23, 2025Updated 5 months ago
- Official PyTorch implementation of the paper Transformer-Based Image Generation from Scene Graphs https://arxiv.org/abs/2303.04634☆19Jan 30, 2024Updated 2 years ago
- Code for "Hierarchical World Models as Visual Whole-Body Humanoid Controllers"☆207Sep 18, 2025Updated 7 months ago
- [ACM Multimedia 2025 Datasets Track] EditWorld: Simulating World Dynamics for Instruction-Following Image Editing☆140Aug 2, 2025Updated 8 months ago
- Jupyter notebooks for PuLID face transfer with Flux.1 dev. Able to run on Google Colab Free Tier☆18Dec 18, 2024Updated last year
- Code release for: Controllable Layer Decomposition for Reversible Multi-Layer Image Generation☆46Dec 7, 2025Updated 4 months ago
- LMAct: A Benchmark for In-Context Imitation Learning with Long Multimodal Demonstrations☆27May 21, 2025Updated 11 months ago
- Code for FLIP: Flow-Centric Generative Planning for General-Purpose Manipulation Tasks☆83Dec 12, 2024Updated last year
- Suite of human-collected datasets and a multi-task continuous control benchmark for open vocabulary visuolinguomotor learning.☆356Apr 21, 2026Updated last week
- ☆47Nov 28, 2024Updated last year
- [ICCV 2025] LightsOut: Diffusion-based Outpainting for Enhanced Lens Flare Removal☆28Oct 20, 2025Updated 6 months ago
- ☆89Aug 4, 2025Updated 8 months ago
- Code repository for T2V-Turbo and T2V-Turbo-v2☆313Jan 31, 2025Updated last year
- faster parallel inference of mochi-1 video generation model☆125Feb 25, 2025Updated last year
- official repo for "VideoScore: Building Automatic Metrics to Simulate Fine-grained Human Feedback for Video Generation" [EMNLP2024]☆116Dec 4, 2025Updated 4 months ago
- Implementation of the HIERarchical imagination On Structured State Space Sequence Models (HIEROS) paper☆22Jul 14, 2024Updated last year
- Video Generation, Physical Commonsense, Semantic Adherence, VideoCon-Physics☆192Jan 30, 2026Updated 3 months ago
- [ICLR 2025🎉] This is the official implementation of paper "Robots Pre-Train Robots: Manipulation-Centric Robotic Representation from Lar…☆94Jan 22, 2025Updated last year
- [NeurIPS 2024] CLOVER: Closed-Loop Visuomotor Control with Generative Expectation for Robotic Manipulation☆133Sep 8, 2025Updated 7 months ago
- A Multimodal Generative World Model for Autonomous Driving with Geometric Representations☆13Aug 27, 2025Updated 8 months ago
- Implementation of Latent Diffusion Planning (Amber Xie, Oleh Rybkin, Dorsa Sadigh, Chelsea Finn)☆65Jun 29, 2025Updated 10 months ago
- Official PyTorch Implementation of Unified Video Action Model (RSS 2025)☆370Jul 23, 2025Updated 9 months ago