lyogavin / train_your_own_sora

☆194

Alternatives and similar repositories for train_your_own_sora

Users who are interested in train_your_own_sora are comparing it to the repositories listed below.

SHI-Labs / VCoder
[CVPR 2024] VCoder: Versatile Vision Encoders for Multimodal Large Language Models
☆279 Updated last year
lucidrains / lumiere-pytorch
Implementation of Lumiere, SOTA text-to-video generation from Google DeepMind, in PyTorch
☆281 Updated last year
invictus717 / InteractiveVideo
InteractiveVideo: User-Centric Controllable Video Generation with Synergistic Multimodal Instructions
☆130 Updated last year
DiffusionGPT / DiffusionGPT
☆208 Updated last year
open-mmlab / Live2Diff
Live2Diff: A Pipeline that processes Live video streams by a uni-directional video Diffusion model.
☆198 Updated last year
HL-hanlin / VideoDirectorGPT
official implementation of VideoDirectorGPT: Consistent Multi-scene Video Generation via LLM-Guided Planning (COLM 2024)
☆176 Updated last year
eric-ai-lab / swap-anything
Official implementation of the ECCV paper "SwapAnything: Enabling Arbitrary Object Swapping in Personalized Visual Editing"
☆265 Updated last year
bytedance / MoMA
MoMA: Multimodal LLM Adapter for Fast Personalized Image Generation
☆234 Updated last year
Yuanshi9815 / Video-Infinity
Video-Infinity generates long videos quickly using multiple GPUs without extra training.
☆187 Updated last year
Vchitect / Vlogger
[CVPR2024] Make Your Dream A Vlog
☆428 Updated 6 months ago
aim-uofa / AutoStory
[IJCV'24] AutoStory: Generating Diverse Storytelling Images with Minimal Human Effort
☆151 Updated 11 months ago
WangWenhao0716 / VidProM
[NeurIPS 2024] VidProM: A Million-scale Real Prompt-Gallery Dataset for Text-to-Video Diffusion Models
☆166 Updated last year
Ji4chenLi / t2v-turbo
Code repository for T2V-Turbo and T2V-Turbo-v2
☆306 Updated 9 months ago
huggingface / instruction-tuned-sd
Code for instruction-tuning Stable Diffusion.
☆245 Updated last year
Alpha-VLLM / Lumina-mGPT
Official Implementation of "Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraini…
☆629 Updated last month
Zeqiang-Lai / Mini-DALLE3
Mini-DALLE3: Interactive Text to Image by Prompting Large Language Models
☆312 Updated last year
google / imageinwords
Data release for the ImageInWords (IIW) paper.
☆222 Updated last year
viiika / Meissonic
[ICLR 2025] Official Implementation of Meissonic: Revitalizing Masked Generative Transformers for Efficient High-Resolution Text-to-Image…
☆336 Updated 2 weeks ago
AILab-CVC / Animate-A-Story
Retrieval-Augmented Video Generation for Telling a Story
☆258 Updated last year
kyegomez / movie-gen
An open source community implementation of the model from the paper: "Movie Gen: A Cast of Media Foundation Models". Join our community …
☆58 Updated this week
eai-lab / On-device-Sora
[arXiv] On-device Sora: Enabling Diffusion-Based Text-to-Video Generation for Mobile Devices
☆126 Updated 4 months ago
aim-uofa / MovieDreamer
[ICLR'25] MovieDreamer: Hierarchical Generation for Coherent Long Visual Sequences
☆317 Updated last year
lucidrains / MIMO-pytorch
Pytorch implementation of MIMO, Controllable Character Video Synthesis with Spatial Decomposed Modeling, from Alibaba Intelligence Group
☆136 Updated last year
AILab-CVC / TaleCrafter
[SIGGRAPH Asia 2023] An interactive story visualization tool that supports multiple characters
☆268 Updated last year
facebookresearch / MovieGenBench
Movie Gen Bench - two media generation evaluation benchmarks released with Meta Movie Gen
☆429 Updated 8 months ago
Jeff-LiangF / FlowVid
☆143 Updated last year
ai-forever / KandinskyVideo
KandinskyVideo — multilingual end-to-end text2video latent diffusion model
☆184 Updated last year
AILab-CVC / FreeNoise
[ICLR 2024] Code for FreeNoise based on VideoCrafter
☆419 Updated 2 months ago
poloclub / ClickDiffusion
ClickDiffusion: Harnessing LLMs for Interactive Precise Image Editing
☆69 Updated last year
modelscope / lite-sora
An initiative to replicate Sora
☆104 Updated last year