Concat-ID: Towards Universal Identity-Preserving Video Synthesis
☆66May 7, 2025Updated 9 months ago
Alternatives and similar repositories for Concat-ID
Users who are interested in Concat-ID are comparing it to the libraries listed below.
- Blending Custom Photos with Video Diffusion Transformers☆48Jan 21, 2025Updated last year
- Benchmark dataset and code of MSRVTT-Personalization☆52Nov 10, 2025Updated 3 months ago
- FantasyID: Face Knowledge Enhanced ID-Preserving Video Generation☆78Aug 20, 2025Updated 6 months ago
- [CVPR 2025 Highlight🔥] Identity-Preserving Text-to-Video Generation by Frequency Decomposition☆828Aug 30, 2025Updated 6 months ago
- TEMPURA enables video-language models to reason about causal event relationships and generate fine-grained, timestamped descriptions of u…☆25Jun 4, 2025Updated 9 months ago
- (NeurIPS 2024) BiDM: Pushing the Limit of Quantization for Diffusion Models☆22Nov 20, 2024Updated last year
- Unifying Specialized Visual Encoders for Video Language Models☆25Nov 22, 2025Updated 3 months ago
- Finetuning and inference tools for the CogView4 and CogVideoX model series.☆118May 14, 2025Updated 9 months ago
- ☆13Feb 2, 2025Updated last year
- ☆22Jan 26, 2026Updated last month
- Official Implementation of VideoDPO☆160Jun 1, 2025Updated 9 months ago
- Official model implementation and benchmark evaluation repository of <AnyEdit: Unified High-Quality Image Edit with Any Idea>☆31Jul 18, 2025Updated 7 months ago
- SkyReels-A2: Compose anything in video diffusion transformers☆704Jun 3, 2025Updated 9 months ago
- Official code of "Edit Transfer: Learning Image Editing via Vision In-Context Relations"☆88Jun 6, 2025Updated 8 months ago
- Official implementation for our paper: Rethinking Video Tokenization: A Conditioned Diffusion-based Approach☆14Apr 2, 2025Updated 11 months ago
- An MCP server providing intelligent transcript processing capabilities, featuring natural formatting, contextual repair, and smart summar…☆18Mar 14, 2025Updated 11 months ago
- Diffusion Powers Video Tokenizer for Comprehension and Generation (CVPR 2025)☆86Feb 27, 2025Updated last year
- CVPRW 2025 paper Progressive Autoregressive Video Diffusion Models: https://arxiv.org/abs/2410.08151☆90May 12, 2025Updated 9 months ago
- [Siggraph Asia 2025] Official code release of our paper "Shape-for-Motion: Precise and Consistent Video Editing with 3D Proxy"☆58Sep 26, 2025Updated 5 months ago
- ☆386Jun 6, 2024Updated last year
- [ICLR 2026] 🐻 Uniform Discrete Diffusion with Metric Path for Video Generation☆106Feb 6, 2026Updated 3 weeks ago
- [AAAI'26] Steering One-Step Diffusion Model with Fidelity-Rich Decoder for Fast Image Compression☆19Dec 21, 2025Updated 2 months ago
- EchoX: Towards Mitigating Acoustic-Semantic Gap via Echo Training for Speech-to-Speech LLMs☆46Sep 19, 2025Updated 5 months ago
- RAG-RewardBench: Benchmarking Reward Models in Retrieval Augmented Generation for Preference Alignment☆16Dec 19, 2024Updated last year
- [ICCV 2025] MagicMirror: ID-Preserved Video Generation in Video Diffusion Transformers☆128Jun 26, 2025Updated 8 months ago
- ☆23Jul 20, 2025Updated 7 months ago
- On Path to Multimodal Generalist: General-Level and General-Bench☆18Jul 11, 2025Updated 7 months ago
- Phantom: Subject-Consistent Video Generation via Cross-Modal Alignment☆1,488Sep 11, 2025Updated 5 months ago
- Implementation code of the paper MIGE: A Unified Framework for Multimodal Instruction-Based Image Generation and Editing☆72Jul 13, 2025Updated 7 months ago
- Scalable and memory-optimized training of diffusion models☆1,341Jun 4, 2025Updated 9 months ago
- MoviiGen 1.1: Towards Cinematic-Quality Video Generative Models☆183Jul 21, 2025Updated 7 months ago
- [ICCV 2025] Official implementation of the paper: REPA-E: Unlocking VAE for End-to-End Tuning of Latent Diffusion Transformers☆463Dec 6, 2025Updated 2 months ago
- Official Implementation for Diffusion Models Without Classifier-free Guidance☆171Feb 18, 2025Updated last year
- 📹 A more flexible framework that can generate videos at any resolution and creates videos from images.☆1,929Updated this week
- The dataset CoLan-150K and the concept decomposition in the paper Concept Lancet (CVPR 2025)☆20Jan 18, 2026Updated last month
- KMM: Key Frame Mask Mamba for Extended Motion Generation☆19Sep 22, 2025Updated 5 months ago
- [ICCV2025] WikiAutoGen official page☆24Feb 6, 2026Updated 3 weeks ago
- UQ: Assessing Language Models on Unsolved Questions☆30Aug 26, 2025Updated 6 months ago
- INF-LLaVA: Dual-perspective Perception for High-Resolution Multimodal Large Language Model☆42Aug 4, 2024Updated last year