lyogavin / train_your_own_sora
☆176, updated 8 months ago
Related projects
Alternatives and complementary repositories for train_your_own_sora
- VCoder: Versatile Vision Encoders for Multimodal Large Language Models, arXiv 2023 / CVPR 2024 ☆261, updated 6 months ago
- Implementation of Lumiere, SOTA text-to-video generation from Google DeepMind, in PyTorch ☆246, updated 3 months ago
- Official repository for the paper PLLaVA ☆581, updated 3 months ago
- Video Diffusion Alignment via Reward Gradients. We improve a variety of video diffusion models such as VideoCrafter, OpenSora, ModelScope… ☆210, updated 2 months ago
- [ICLR 2024] Code for FreeNoise based on VideoCrafter ☆384, updated 3 months ago
- Official Implementation of "Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraini… ☆495, updated 2 months ago
- [TMLR23] Official implementation of UnIVAL: Unified Model for Image, Video, Audio and Language Tasks. ☆224, updated 10 months ago
- Official repo for paper "MiraData: A Large-Scale Video Dataset with Long Durations and Structured Captions" ☆367, updated 2 months ago
- LLM-grounded Diffusion: Enhancing Prompt Understanding of Text-to-Image Diffusion Models with Large Language Models (LLM-grounded Diffusi… ☆432, updated 2 months ago
- LLaVA-Interactive-Demo ☆352, updated 3 months ago
- Multimodal Models in Real World ☆400, updated last week
- Code repository for T2V-Turbo and T2V-Turbo-v2 ☆247, updated 2 weeks ago
- Data release for the ImageInWords (IIW) paper. ☆200, updated 5 months ago
- An initiative to replicate Sora ☆98, updated 7 months ago
- Official implementation of the ECCV paper "SwapAnything: Enabling Arbitrary Object Swapping in Personalized Visual Editing" ☆230, updated 3 weeks ago
- Official implementation of SEED-LLaMA (ICLR 2024). ☆574, updated last month
- InteractiveVideo: User-Centric Controllable Video Generation with Synergistic Multimodal Instructions ☆126, updated 9 months ago
- [CVPR2024] Make Your Dream A Vlog ☆415, updated 7 months ago
- ☆145, updated 2 months ago
- ☆258, updated this week
- ☆145, updated 3 weeks ago
- ☆406, updated 7 months ago
- [ECCV 2024] FreeInit: Bridging Initialization Gap in Video Diffusion Models ☆488, updated 9 months ago
- Code for instruction-tuning Stable Diffusion. ☆210, updated 8 months ago
- MoMA: Multimodal LLM Adapter for Fast Personalized Image Generation ☆190, updated 3 months ago
- An open source implementation of "Scaling Autoregressive Multi-Modal Models: Pretraining and Instruction Tuning", an all-new multi modal … ☆363, updated 10 months ago
- [SIGGRAPH Asia 2023] An interactive story visualization tool that supports multiple characters ☆253, updated 7 months ago
- ☆165, updated 4 months ago
- Implementation of DragDiffusion: Harnessing Diffusion Models for Interactive Point-based Image Editing ☆227, updated last year
- I'm back! Related sources of Meissonic developed by the community ☆202, updated this week