PKU-YuanGroup / Open-Sora-Plan
This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to it.
☆11,918 Updated this week
Alternatives and similar repositories for Open-Sora-Plan:
Users who are interested in Open-Sora-Plan are comparing it to the repositories listed below.
- Open-Sora: Democratizing Efficient Video Production for All ☆23,513 Updated this week
- Official repo for "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models" ☆3,243 Updated 10 months ago
- Accepted as [NeurIPS 2024] Spotlight Presentation Paper ☆6,216 Updated 5 months ago
- a state-of-the-art-level open visual language model | multimodal pre-trained model ☆6,406 Updated 9 months ago
- tiny vision language model ☆7,560 Updated 2 weeks ago
- text and image to video generation: CogVideoX (2024) and CogVideo (ICLR 2023) ☆10,926 Updated last week
- [NeurIPS 2024 Best Paper][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling:… ☆6,894 Updated last month
- PyTorch code and models for V-JEPA self-supervised learning from video. ☆2,829 Updated last week
- Official implementation of AnimateDiff. ☆11,124 Updated 7 months ago
- [NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond. ☆21,746 Updated 6 months ago
- Latte: Latent Diffusion Transformer for Video Generation. ☆1,793 Updated last week
- Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding ☆3,965 Updated last month
- MiniCPM-o 2.6: A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming on Your Phone ☆18,910 Updated last week
- The official Meta Llama 3 GitHub site ☆28,479 Updated last month
- MiniSora: A community that aims to explore the implementation path and future development direction of Sora. ☆1,259 Updated 3 weeks ago
- Official PyTorch Implementation of "Scalable Diffusion Models with Transformers" ☆6,927 Updated 9 months ago
- Large World Model -- Modeling Text and Video with Millions Context ☆7,248 Updated 4 months ago
- Your image is almost there! ☆7,513 Updated 7 months ago
- Official inference repo for FLUX.1 models ☆20,696 Updated last month
- GPT4V-level open-source multi-modal model based on Llama3-8B ☆2,297 Updated last week
- MiniCPM3-4B: An edge-side LLM that surpasses GPT-3.5-Turbo. ☆7,253 Updated 4 months ago
- A series of large language models trained from scratch by developers @01-ai ☆7,827 Updated 3 months ago
- VideoSys: An easy and efficient system for video generation ☆1,939 Updated this week
- FreeAskInternet is a completely free, PRIVATE and LOCALLY running search aggregator & answer generator using MULTI LLMs, without GPU neede… ☆8,588 Updated 10 months ago
- Code and models for NExT-GPT: Any-to-Any Multimodal Large Language Model ☆3,442 Updated 4 months ago
- InstantID: Zero-shot Identity-Preserving Generation in Seconds 🔥 ☆11,471 Updated 7 months ago
- Champ: Controllable and Consistent Human Image Animation with 3D Parametric Guidance ☆4,172 Updated 8 months ago
- Official Code for Stable Cascade ☆6,588 Updated 7 months ago
- [CVPR 2025] StreamingT2V: Consistent, Dynamic, and Extendable Long Video Generation from Text ☆1,513 Updated 3 months ago
- Kolors Team ☆4,247 Updated 3 months ago