Finetuning and inference tools for the CogView4 and CogVideoX model series.
☆117May 14, 2025Updated 9 months ago
Alternatives and similar repositories for CogKit
Users interested in CogKit are comparing it to the repositories listed below
- Official implementation for our paper: Rethinking Video Tokenization: A Conditioned Diffusion-based Approach☆14Apr 2, 2025Updated 10 months ago
- Code of the paper "FreePCA: Integrating Consistency Information across Long-short Frames in Training-free Long Video Generation via Princi…☆28Aug 26, 2025Updated 6 months ago
- CVPRW 2025 paper Progressive Autoregressive Video Diffusion Models: https://arxiv.org/abs/2410.08151☆90May 12, 2025Updated 9 months ago
- Concat-ID: Towards Universal Identity-Preserving Video Synthesis☆65May 7, 2025Updated 9 months ago
- [AAAI-2026]FlashVideo: Flowing Fidelity to Detail for Efficient High-Resolution Video Generation☆457Mar 5, 2025Updated 11 months ago
- Benchmark dataset and code of MSRVTT-Personalization☆52Nov 10, 2025Updated 3 months ago
- Scalable and memory-optimized training of diffusion models☆1,338Jun 4, 2025Updated 8 months ago
- The official implementation of "Sparse-vDiT: Unleashing the Power of Sparse Attention to Accelerate Video Diffusion Transformers" (arXiv …☆50Jun 6, 2025Updated 8 months ago
- official code repo of CVPR 2025 paper PhyT2V: LLM-Guided Iterative Self-Refinement for Physics-Grounded Text-to-Video Generation☆61Jul 31, 2025Updated 7 months ago
- ☆30Aug 21, 2025Updated 6 months ago
- ☆18Oct 24, 2024Updated last year
- Simple ControlNet module for the CogVideoX model.☆182Jan 12, 2025Updated last year
- TPDiff: Temporal Pyramid Video Diffusion Model☆23Mar 13, 2025Updated 11 months ago
- Official repo of "Guide3D: Create 3D Avatars from Text and Image Guidance"☆39Aug 23, 2023Updated 2 years ago
- [CVPR 2025] Zero-1-to-A: Zero-Shot One Image to Animatable Head Avatars Using Video Diffusion☆43Mar 21, 2025Updated 11 months ago
- [CVPR2024 Highlight] VBench - We Evaluate Video Generation☆1,485Updated this week
- MetaAgent: Toward Self-Evolving Agent via Tool Meta-Learning☆42Sep 3, 2025Updated 5 months ago
- EvoWorld: Evolving Panoramic World Generation with Explicit 3D Memory☆62Jan 13, 2026Updated last month
- [ICCV25] TACA: Rethinking Cross-Modal Interaction in Multimodal Diffusion Transformers☆41Jul 23, 2025Updated 7 months ago
- CogView4, CogView3-Plus and CogView3(ECCV 2024)☆1,105Mar 29, 2025Updated 11 months ago
- [CVPR 2025] DiscoVLA: Discrepancy Reduction in Vision, Language, and Alignment for Parameter-Efficient Video-Text Retrieval☆21Jun 23, 2025Updated 8 months ago
- This is the official code repository for the paper: Towards General Continuous Memory for Vision-Language Models.☆20Jul 3, 2025Updated 7 months ago
- Code for: "Long-Context Autoregressive Video Modeling with Next-Frame Prediction"☆301Apr 23, 2025Updated 10 months ago
- [ECCV 2024] 3DPE: Real-time 3D-aware Portrait Editing from a Single Image☆22Sep 15, 2025Updated 5 months ago
- Official implementation of the paper "Koala-36M: A Large-scale Video Dataset Improving Consistency between Fine-grained Conditions and Vi…☆236Mar 19, 2025Updated 11 months ago
- [ICML2025, NeurIPS2025 Spotlight] Sparse VideoGen 1 & 2: Accelerating Video Diffusion Transformers with Sparse Attention☆627Feb 3, 2026Updated 3 weeks ago
- Official Implementation of "LeX-Art: Rethinking Text Generation via Scalable High-Quality Data Synthesis"☆78Aug 25, 2025Updated 6 months ago
- Image Tokenizer Needs Post-Training☆24Oct 4, 2025Updated 4 months ago
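- An open-source mathematical large language model project that aims to explore whether large models have mathematical creativity, as well as their potential in frontier mathematical research.☆17May 16, 2025Updated 9 months ago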
- This is a LoRA model finetuned on Wan-I2V-14B-480P. It turns things in the image into fluffy toys.☆19Nov 10, 2025Updated 3 months ago
- ☆10Nov 18, 2024Updated last year
- [ICML 2025] LaCache: Ladder-Shaped KV Caching for Efficient Long-Context Modeling of Large Language Models☆17Nov 4, 2025Updated 3 months ago
- The code for "T-GRAG: A Dynamic GraphRAG Framework for Resolving Temporal Conflicts and Redundancy in Knowledge Retrieval"☆20Jul 30, 2025Updated 7 months ago
- SkyReels-A1: Expressive Portrait Animation in Video Diffusion Transformers☆584Jun 5, 2025Updated 8 months ago
- Video Diffusion Transformers are In-Context Learners☆35Jan 6, 2025Updated last year
- [ICCV 2025] Official repository of DiffSim: Taming Diffusion Models for Evaluating Visual Similarity☆29Jul 14, 2025Updated 7 months ago
- Let's finetune video generation models!☆543Sep 15, 2025Updated 5 months ago
- [SIGGRAPH Asia 2023] Official pytorch implementation of "360° Reconstruction From a Single Image Using Space Carved Outpainting"☆17Sep 15, 2023Updated 2 years ago
- [CVPR 2025 Highlight] VideoScene: Distilling Video Diffusion Model to Generate 3D Scenes in One Step☆341Jul 4, 2025Updated 7 months ago