Video-as-Agent/VideoAgent


Official implementation of "Self-Improving Video Generation"

☆77

Alternatives and similar repositories for VideoAgent

Users interested in VideoAgent are comparing it to the repositories listed below.

video-to-action / video-to-action-release
[ICLR 2025 Spotlight] Grounding Video Models to Actions through Goal Conditioned Exploration
☆62 · May 4, 2025 · Updated last year
Kmcode1 / SG-I2V
This is the official implementation of SG-I2V: Self-Guided Trajectory Control in Image-to-Video Generation.
☆116 · Nov 26, 2024 · Updated last year
jialuli-luka / Video-MSG
Training-free Guidance in Text-to-Video Generation via Multimodal Planning and Structured Noise Initialization
☆27 · Apr 14, 2025 · Updated last year
argmax-ai / aime
Official repository for our paper on "Action Inference by Maximising Evidence: Zero-Shot Imitation from Observation with World Models"
☆13 · Dec 4, 2023 · Updated 2 years ago
flow-diffusion / AVDC
Official repository of Learning to Act from Actionless Videos through Dense Correspondences.
☆255 · Apr 25, 2024 · Updated 2 years ago
brendancopley / mcp-chain-of-draft-prompt-tool
MCP prompt tool applying Chain-of-Draft (CoD) reasoning - BYOLLM
☆19 · Sep 8, 2025 · Updated 8 months ago
google-deepmind / physics-IQ-benchmark
Benchmarking physical understanding in generative video models
☆291 · May 5, 2026 · Updated 2 weeks ago
mihirp1998 / VADER
Video Diffusion Alignment via Reward Gradients. We improve a variety of video diffusion models such as VideoCrafter, OpenSora, ModelScope…
☆312 · Mar 12, 2025 · Updated last year
bytedance / CascadeV
DiT for VAE (and Video Generation)
☆35 · Sep 2, 2024 · Updated last year
csmile-1006 / REDS_agent
Subtask-Aware Visual Reward Learning from Segmented Demonstrations (ICLR 2025 accepted)
☆19 · Apr 11, 2025 · Updated last year
YeLuoSuiYou / openstorypp
We introduce OpenStory++, a large-scale open-domain dataset focusing on enabling MLLMs to perform storytelling generation tasks.
☆18 · Aug 30, 2024 · Updated last year
video-language-planning / vlp_code
☆81 · May 23, 2025 · Updated 11 months ago
clvrai / boss
Code for the paper Bootstrap Your Own Skills: Learning to Solve New Tasks with Large Language Model Guidance, accepted to CoRL 2023 as an…
☆34 · Jul 15, 2025 · Updated 10 months ago
ShmuelRonen / ComfyUI-Veo2-Experimental
A custom node extension for ComfyUI that integrates Google's Veo 2 text-to-video generation capabilities.
☆32 · Apr 12, 2025 · Updated last year
Large-Trajectory-Model / ATM
Official codebase for "Any-point Trajectory Modeling for Policy Learning"
☆276 · Jun 19, 2025 · Updated 11 months ago
LatentActionPretraining / LAPA
[ICLR 2025] LAPA: Latent Action Pretraining from Videos
☆523 · Jan 22, 2025 · Updated last year
world-model-eval / world-model-eval
Code for "Evaluating Robot Policies in a World Model".
☆91 · Nov 6, 2025 · Updated 6 months ago
thuml / RLVR-World
Official repository for "RLVR-World: Training World Models with Reinforcement Learning" (NeurIPS 2025), https://arxiv.org/abs/2505.13934
☆251 · Oct 28, 2025 · Updated 6 months ago
LaVi-Lab / EgoMask
[ICCV 2025] "Fine-grained Spatiotemporal Grounding on Egocentric Videos"
☆23 · Nov 23, 2025 · Updated 5 months ago
nicklashansen / puppeteer
Code for "Hierarchical World Models as Visual Whole-Body Humanoid Controllers"
☆207 · Sep 18, 2025 · Updated 8 months ago
LucipherDev / Flux.1-dev-PuLID-jupyter
Jupyter notebooks for PuLID face transfer with Flux.1 dev. Able to run on Google Colab Free Tier
☆18 · Dec 18, 2024 · Updated last year
monkek123King / CLD
Code release for: Controllable Layer Decomposition for Reversible Multi-Layer Image Generation
☆47 · Dec 7, 2025 · Updated 5 months ago
google-research / language-table
Suite of human-collected datasets and a multi-task continuous control benchmark for open vocabulary visuolinguomotor learning.
☆356 · May 11, 2026 · Updated last week
google-deepmind / lm_act
LMAct: A Benchmark for In-Context Imitation Learning with Long Multimodal Demonstrations
☆28 · May 21, 2025 · Updated last year
tinnerhrhe / VPDD
☆47 · Nov 28, 2024 · Updated last year
robot-colosseum / robot-colosseum
A Benchmark for Evaluating Generalization for Robotic Manipulation
☆147 · Mar 3, 2025 · Updated last year
Ray-1026 / LightsOut-official
[ICCV 2025] LightsOut: Diffusion-based Outpainting for Enhanced Lens Flare Removal
☆28 · Oct 20, 2025 · Updated 7 months ago
MaxSobolMark / PolicyAgnosticRL
☆91 · Aug 4, 2025 · Updated 9 months ago
Ji4chenLi / t2v-turbo
Code repository for T2V-Turbo and T2V-Turbo-v2
☆313 · Jan 31, 2025 · Updated last year
TIGER-AI-Lab / VideoScore
Official repo for "VideoScore: Building Automatic Metrics to Simulate Fine-grained Human Feedback for Video Generation" [EMNLP 2024]
☆119 · Dec 4, 2025 · Updated 5 months ago
Snagnar / Hieros
Implementation of the HIERarchical imagination On Structured State Space Sequence Models (HIEROS) paper
☆22 · Jul 14, 2024 · Updated last year
showlab / T2VScore
T2VScore: Towards A Better Metric for Text-to-Video Generation
☆81 · Apr 10, 2024 · Updated 2 years ago
luccachiang / robots-pretrain-robots
[ICLR 2025🎉] This is the official implementation of paper "Robots Pre-Train Robots: Manipulation-Centric Robotic Representation from Lar…
☆95 · Jan 22, 2025 · Updated last year
jmwang0117 / Video4Robot
List of papers on video-centric robot learning
☆23 · Nov 16, 2024 · Updated last year
Hritikbansal / videophy
Video Generation, Physical Commonsense, Semantic Adherence, VideoCon-Physics
☆196 · Jan 30, 2026 · Updated 3 months ago
OpenDriveLab / CLOVER
[NeurIPS 2024] CLOVER: Closed-Loop Visuomotor Control with Generative Expectation for Robotic Manipulation
☆134 · Sep 8, 2025 · Updated 8 months ago
fzi-forschungszentrum-informatik / muvo
A Multimodal Generative World Model for Autonomous Driving with Geometric Representations
☆13 · Aug 27, 2025 · Updated 8 months ago
amberxie88 / latent_diffusion_planning
Implementation of Latent Diffusion Planning (Amber Xie, Oleh Rybkin, Dorsa Sadigh, Chelsea Finn)
☆65 · Jun 29, 2025 · Updated 10 months ago
ShuangLI59 / unified_video_action
Official PyTorch Implementation of Unified Video Action Model (RSS 2025)
☆374 · Jul 23, 2025 · Updated 9 months ago