minisora-DiT, a DiT reproduction based on XTuner from the open source community MiniSora
☆40Mar 25, 2024Updated last year
Alternatives and similar repositories for MiniSora-DiT
Users that are interested in MiniSora-DiT are comparing it to the libraries listed below
- MiniSora: A community that aims to explore the implementation path and future development direction of Sora.☆1,285Feb 18, 2025Updated last year
- ☆27Sep 20, 2023Updated 2 years ago
- MICCAI 2022 MELA Challenge: Mediastinal Lesion Analysis (3D Detection)☆11Jun 30, 2022Updated 3 years ago
- ☆16Dec 15, 2021Updated 4 years ago
- Chinese-native image generation while compatible with SD eco-system, 1st-gen, AAAI2025☆13Jun 25, 2024Updated last year
- Videoshop: Localized Semantic Video Editing with Noise-Extrapolated Diffusion Inversion☆45Aug 1, 2024Updated last year
- A modified version of the original Magic Animate (https://showlab.github.io/magicanimate/)☆20Feb 27, 2024Updated 2 years ago
- An official pytorch implementation of AAAI 2024 paper "Latent Space Editing in Transformer-based Flow Matching"☆51Apr 10, 2024Updated last year
- This project fixes the Wav2Lip project so that it can run on Python 3.9. Wav2Lip is a project that can be used to lip-sync videos to audi…☆17Aug 31, 2023Updated 2 years ago
- [TMLR 2025] Latte: Latent Diffusion Transformer for Video Generation.☆1,920Oct 30, 2025Updated 4 months ago
- Data crawler and cleaner for training NovelAI embeddings☆21May 22, 2025Updated 9 months ago
- 👆Pytorch implementation of "Ctrl-V: Higher Fidelity Video Generation with Bounding-Box Controlled Object Motion"☆33Jul 28, 2025Updated 7 months ago
- [CVPR 2023] Official PyTorch implementation of MoStGAN-V☆24Jun 15, 2023Updated 2 years ago
- ☆27Jun 27, 2023Updated 2 years ago
- Official codes for the paper "GARDO: Reinforcing Diffusion Models without Reward Hacking"☆56Feb 2, 2026Updated last month
- Code base for zero-shot action localization through spatial-aware object embeddings☆25Nov 3, 2017Updated 8 years ago
- ☆29May 13, 2024Updated last year
- [CVPR 2024] Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers☆674Oct 25, 2024Updated last year
- 🔥stable, simple, state-of-the-art VQVAE toolkit & cookbook☆105Jun 23, 2024Updated last year
- KAN-based Fusion of Dual Domain for Audio-Driven Landmarks Generation. The model can help you generate a sequence of facial landmarks f…☆30Oct 28, 2025Updated 4 months ago
- official implementation of the paper: Towards End-to-End Generative Modeling of Long Videos with Memory-Efficient Bidirectional Transform…☆28May 18, 2023Updated 2 years ago
- ☆28Oct 1, 2023Updated 2 years ago
- [IJCV 2024] LaVie: High-Quality Video Generation with Cascaded Latent Diffusion Models☆948Nov 13, 2024Updated last year
- ☆21Dec 14, 2025Updated 2 months ago
- Digital human code based on MuseTalk.☆35Sep 14, 2024Updated last year
- ☆34Dec 29, 2025Updated 2 months ago
- Official PyTorch implementation of TATS: A Long Video Generation Framework with Time-Agnostic VQGAN and Time-Sensitive Transformer (ECCV …☆286May 1, 2024Updated last year
- imagetokenizer is a Python package that helps you encode visuals and generate visual token IDs from a codebook; supports both image and video…☆41Jun 22, 2024Updated last year
- LLaVA combined with the Magvit image tokenizer, training an MLLM without a Vision Encoder. Unifying image understanding and generation.☆39Jun 20, 2024Updated last year
- Reranking for Multi-objective Optimized Recommender Systems☆11Aug 3, 2023Updated 2 years ago
- The project is an official implementation of our paper "RSGNet: Relation based Skeleton Graph Network for Crowded Scenes Pose Estimation…☆10Dec 9, 2020Updated 5 years ago
- SyncTalkFace: Talking Face Generation for Precise Lip-syncing via Audio-Lip Memory☆33Nov 3, 2022Updated 3 years ago
- A free and open-source focus stacking software that supports multi-focus image alignment and fusion.☆20Feb 5, 2026Updated last month
- Notes on Hung-yi Lee's machine learning course☆10Jul 3, 2022Updated 3 years ago
- Virtual news production using Tacotron2 and Wav2Lip☆11Nov 14, 2023Updated 2 years ago
- VideoSys: An easy and efficient system for video generation☆2,016Aug 27, 2025Updated 6 months ago
- Code release for AccDiffusion (ECCV 2024)☆93Nov 19, 2024Updated last year
- Implements VAR+CLIP for text-to-image (T2I) generation☆147Jan 23, 2025Updated last year
- Official implementation of FIFO-Diffusion: Generating Infinite Videos from Text without Training (NeurIPS 2024)☆481Oct 18, 2024Updated last year