showlab / DoraCycle
[CVPR 2025] DoraCycle: Domain-Oriented Adaptation of Unified Generative Model in Multimodal Cycles
☆26 · Updated 2 months ago
Alternatives and similar repositories for DoraCycle
Users who are interested in DoraCycle are comparing it to the repositories listed below.
- The code repository of UniRL ☆33 · Updated last month
- [CVPR 2025] InstanceCap: Improving Text-to-Video Generation via Instance-aware Structured Caption ☆44 · Updated last week
- Code for [CVPR 2025] ROICtrl: Boosting Instance Control for Visual Generation ☆109 · Updated 3 months ago
- ☆135 · Updated 2 weeks ago
- ☆25 · Updated 3 months ago
- Ego-R1: Chain-of-Tool-Thought for Ultra-Long Egocentric Video Reasoning ☆79 · Updated 3 weeks ago
- Video-Holmes: Can MLLM Think Like Holmes for Complex Video Reasoning? ☆60 · Updated this week
- [NeurIPS 2024 D&B Track] Official Repo for "LVD-2M: A Long-take Video Dataset with Temporally Dense Captions" ☆63 · Updated 9 months ago
- ☆25 · Updated 2 months ago
- GoT-R1: Unleashing Reasoning Capability of MLLM for Visual Generation with Reinforcement Learning ☆89 · Updated last month
- ☆88 · Updated 3 weeks ago
- FQGAN: Factorized Visual Tokenization and Generation ☆50 · Updated 3 months ago
- [ICCV2025] Code Release of Harmonizing Visual Representations for Unified Multimodal Understanding and Generation ☆141 · Updated last month
- ☆24 · Updated last year
- ICML 2025 - Impossible Videos ☆68 · Updated last month
- [NeurIPS 2024] EvolveDirector: Approaching Advanced Text-to-Image Generation with Large Vision-Language Models. ☆49 · Updated 9 months ago
- Official implementation of MARS: Mixture of Auto-Regressive Models for Fine-grained Text-to-image Synthesis ☆85 · Updated last year
- [arXiv: 2502.05178] QLIP: Text-Aligned Visual Tokenization Unifies Auto-Regressive Multimodal Understanding and Generation ☆76 · Updated 4 months ago
- ☆30 · Updated 7 months ago
- This is a repository for organizing papers, codes and other resources related to Visual Reinforcement Learning. ☆16 · Updated 2 weeks ago
- ☆87 · Updated 3 weeks ago
- [NeurIPS 2024] The official implement of research paper "FreeLong : Training-Free Long Video Generation with SpectralBlend Temporal Atten… ☆51 · Updated 2 weeks ago
- Training-free Guidance in Text-to-Video Generation via Multimodal Planning and Structured Noise Initialization ☆21 · Updated 3 months ago
- ☆34 · Updated 3 weeks ago
- ICML2025 ☆49 · Updated last month
- Code for "VideoRepair: Improving Text-to-Video Generation via Misalignment Evaluation and Localized Refinement" ☆47 · Updated 7 months ago
- [ICCV2025] VEGGIE: Instructional Editing and Reasoning Video Concepts with Grounded Generation ☆21 · Updated 3 weeks ago
- Official implementation of LiFT: Leveraging Human Feedback for Text-to-Video Model Alignment. ☆79 · Updated 2 months ago
- Official Implementation of VideoDPO ☆121 · Updated last month
- Official repository of "Inst-IT: Boosting Multimodal Instance Understanding via Explicit Visual Prompt Instruction Tuning" ☆32 · Updated 4 months ago