huang-yh/Owl

☆52

Alternatives and similar repositories for Owl

Users who are interested in Owl are comparing it to the libraries listed below.

FuchenUSTC / VideoStudio
☆33 • Jul 5, 2024 • Updated 2 years ago
huang-yh / SpectralAR
[ICCV 25] SpectralAR: Spectral Autoregressive Visual Generation
☆36 • Jun 13, 2025 • Updated last year
lambert-x / VideoAuteur
VideoAuteur: Towards Long Narrative Video Generation
☆44 • Oct 22, 2025 • Updated 8 months ago
microsoft / Reducio-VAE
☆217 • Feb 11, 2025 • Updated last year
hehao13 / CameraCtrl
☆656 • May 24, 2024 • Updated 2 years ago
desaixie / pa_vdm
CVPRW 2025 paper Progressive Autoregressive Video Diffusion Models: https://arxiv.org/abs/2410.08151
☆89 • May 12, 2025 • Updated last year
NVlabs / TokenBench
A Video Tokenizer Evaluation Dataset
☆157 • Jan 13, 2025 • Updated last year
THU-SI / DreamCinema
DreamCinema: Cinematic Transfer with Free Camera and 3D Character
☆96 • Jun 13, 2025 • Updated last year
wzzheng / Doe
Doe-1: Closed-Loop Autonomous Driving with Large World Model
☆113 • Jan 21, 2025 • Updated last year
Robertwyq / Drivingdojo
[NeurIPS 2024] DrivingDojo Dataset: Advancing Interactive and Knowledge-Enriched Driving World Model
☆87 • Dec 5, 2024 • Updated last year
rongyaofang / PUMA
Empowering Unified MLLM with Multi-granular Visual Generation
☆132 • Jan 16, 2025 • Updated last year
huang-yh / Terra
☆31 • Oct 17, 2025 • Updated 9 months ago
microsoft / VidTok
A family of versatile and state-of-the-art video tokenizers.
☆453 • Sep 1, 2025 • Updated 10 months ago
chen-wl20 / GenWorld
GenWorld: Towards Detecting AI-generated Real-world Simulation Videos
☆37 • Jun 13, 2025 • Updated last year
mihirp1998 / VADER
Video Diffusion Alignment via Reward Gradients. We improve a variety of video diffusion models such as VideoCrafter, OpenSora, ModelScope…
☆315 • Mar 12, 2025 • Updated last year
lmbxmu / CutDiffusion
CutDiffusion: A Simple, Fast, Cheap, and Strong Diffusion Extrapolation Method
☆27 • Oct 9, 2025 • Updated 9 months ago
XMUDeepLIT / AVG-LLaVA
Code for "AVG-LLaVA: A Multimodal Large Model with Adaptive Visual Granularity"
☆33 • Oct 12, 2024 • Updated last year
ali-vilab / CDT
Official implementation for our paper: Rethinking Video Tokenization: A Conditioned Diffusion-based Approach
☆17 • Apr 2, 2025 • Updated last year
Jialuo-Li / Science-T2I
[CVPR 2025] Science-T2I: Addressing Scientific Illusions in Image Synthesis
☆62 • Mar 31, 2026 • Updated 3 months ago
baaivision / NOVA
[ICLR 2025] Autoregressive Video Generation without Vector Quantization
☆656 • Oct 29, 2025 • Updated 8 months ago
chen-wl20 / SceneCompleter
SceneCompleter: Dense 3D Scene Completion for Generative Novel View Synthesis
☆36 • Jun 13, 2025 • Updated last year
open-mmlab / Live2Diff
Live2Diff: A Pipeline that processes Live video streams by a uni-directional video Diffusion model.
☆199 • Jul 22, 2024 • Updated last year
xizaoqu / WorldMem
[NeurIPS 2025] WorldMem: Long-term Consistent World Simulation with Memory
☆379 • Feb 21, 2026 • Updated 5 months ago
bestzzhang / SGEdit-code
☆19 • May 24, 2026 • Updated last month
snap-research / VIMI
☆13 • Jul 10, 2024 • Updated 2 years ago
Nithin-GK / MaxFusion
[ECCV'24] MaxFusion: Plug & Play multimodal generation in text to image diffusion models
☆27 • Nov 2, 2024 • Updated last year
lixirui142 / UniCon
UniCon: A Simple Approach to Unifying Diffusion-based Conditional Generation (ICLR 2025)
☆38 • Jun 21, 2025 • Updated last year
PKU-YuanGroup / Next-Patch-Prediction
[AAAI 26] Next Patch Prediction
☆129 • Jan 2, 2025 • Updated last year
justincui03 / Self-Forcing-Plus-Plus
Official Repo for Self-Forcing++ High Quality Long Video Generation
☆264 • Oct 13, 2025 • Updated 9 months ago
getterupper / PreWorld
[ICLR 2025] Semi-Supervised Vision-Centric 3D Occupancy World Model for Autonomous Driving
☆56 • Feb 14, 2025 • Updated last year
aimagelab / HySAC
Hyperbolic Safety-Aware Vision-Language Models. CVPR 2025
☆31 • Apr 8, 2025 • Updated last year
AdaCache-DiT / AdaCache
Code for our ICCV 2025 paper "Adaptive Caching for Faster Video Generation with Diffusion Transformers"
☆172 • Nov 5, 2024 • Updated last year
Osilly / Interleaving-Reasoning-Generation
[ICLR 2026] This is an early exploration to introduce Interleaving Reasoning to Text-to-image Generation field and achieve the SoTA bench…
☆100 • Jan 26, 2026 • Updated 5 months ago
wdrink / OpenTokenizer
☆21 • Jan 17, 2025 • Updated last year
JLChen-C / OccProphet
[ICLR 2025] OccProphet: Pushing Efficiency Frontier of Camera-Only 4D Occupancy Forecasting with Observer-Forecaster-Refiner Framework
☆60 • Mar 18, 2026 • Updated 4 months ago
FreedomIntelligence / ShareGPT-4o-Image
☆285 • Jul 22, 2025 • Updated last year
martian422 / MaskGRPO
The official implementation of MaskGRPO: Consolidating Reinforcement Learning for Multimodal Discrete Diffusion Models. (ICLR 2026, arxiv…
☆19 • Jan 27, 2026 • Updated 5 months ago
mbzuai-oryx / TrackingMeetsLMM
☆10 • Apr 7, 2025 • Updated last year
Yuanshi9815 / Video-Infinity
Video-Infinity generates long videos quickly using multiple GPUs without extra training.
☆191 • Aug 4, 2024 • Updated last year