[CVPR 2025] DoraCycle: Domain-Oriented Adaptation of Unified Generative Model in Multimodal Cycles
☆30May 13, 2025Updated 9 months ago
Alternatives and similar repositories for DoraCycle
Users who are interested in DoraCycle are comparing it to the repositories listed below.
- Edit and Generate Anything in 3D world!☆14Apr 15, 2023Updated 2 years ago
- Enable AI to control your PC. This repo includes the WorldGUI Benchmark and GUI-Thinker Agent Framework.☆113Jul 27, 2025Updated 7 months ago
- The code repository of UniRL☆51May 30, 2025Updated 9 months ago
- Code for [CVPR 2025] ROICtrl: Boosting Instance Control for Visual Generation☆111Apr 16, 2025Updated 10 months ago
- ☆57Apr 28, 2025Updated 10 months ago
- FQGAN: Factorized Visual Tokenization and Generation☆59Mar 29, 2025Updated 11 months ago
- A large-scale dataset for training and evaluating models' ability in dense text image generation☆86Sep 27, 2025Updated 5 months ago
- TPDiff: Temporal Pyramid Video Diffusion Model☆25Mar 13, 2025Updated 11 months ago
- Multi-human Interactive Talking Dataset☆68Aug 6, 2025Updated 6 months ago
- [ICLR 2026] Draw-In-Mind: Rebalancing Designer-Painter Roles in Unified Multimodal Models Benefits Image Editing☆25Jan 27, 2026Updated last month
- ICML 2025 - Impossible Videos☆83Jul 23, 2025Updated 7 months ago
- Repository of GUI Action Narrator☆12Apr 8, 2025Updated 10 months ago
- Glance: Accelerating Diffusion Models with 1 Sample☆152Dec 24, 2025Updated 2 months ago
- ☆27Apr 25, 2025Updated 10 months ago
- UNCAGE: Contrastive Attention Guidance for Masked Generative Transformers in Text-to-Image Generation☆18Aug 12, 2025Updated 6 months ago
- More reliable Video Understanding Evaluation☆14Sep 23, 2025Updated 5 months ago
- This is the project page for the HOSNeRF☆16Dec 11, 2023Updated 2 years ago
- ☆24May 23, 2025Updated 9 months ago
- Computer-Use Agents as Judges for Generative UI☆43Nov 27, 2025Updated 3 months ago
- HOSNeRF: Dynamic Human-Object-Scene Neural Radiance Fields from a Single Video☆68Dec 12, 2023Updated 2 years ago
- ☆73May 10, 2024Updated last year
- [ECCV 2024] Learning Video Context as Interleaved Multimodal Sequences☆43Mar 11, 2025Updated 11 months ago
- Official implementation of the paper "Delving into Latent Spectral Biasing of Video VAEs for Superior Diffusability"☆46Dec 25, 2025Updated 2 months ago
- The official repo for LIFT: Language-Image Alignment with Fixed Text Encoders☆42Jun 10, 2025Updated 8 months ago
- [ECCV 2022] AssistQ: Affordance-centric Question-driven Task Completion for Egocentric Assistant☆23Jan 30, 2026Updated last month
- [NeurIPS 2024] One Token to Seg Them All: Language Instructed Reasoning Segmentation in Videos☆146Dec 26, 2024Updated last year
- (ECCV 2024) Empowering Multimodal Large Language Model as a Powerful Data Generator☆114Mar 21, 2025Updated 11 months ago
- [ICCV 2023] Label-Efficient Online Continual Object Detection in Streaming Video☆23Jan 8, 2024Updated 2 years ago
- Orienting Latent Actions for Video World Modeling☆77Feb 11, 2026Updated 3 weeks ago
- [IJCV 2025] Paragraph-to-Image Generation with Information-Enriched Diffusion Model☆106Mar 24, 2025Updated 11 months ago
- A Text2SQL benchmark for evaluation of Large Language Models☆41Feb 24, 2026Updated last week
- Official repo for paper "EMMA: Efficient Multimodal Understanding, Generation, and Editing with a Unified Architecture."☆62Dec 16, 2025Updated 2 months ago
- The first spoken long-text dataset derived from live streams, designed to reflect the redundancy-rich and conversational nature of real-w…☆12Jun 28, 2025Updated 8 months ago
- Official Implementation of OpenING: A Comprehensive Benchmark for Judging Open-ended Interleaved Image-Text Generation☆38Jul 5, 2025Updated 7 months ago
- This tool allows local LLM usage that can automate tasks without human intervention. The agent can call itself recursively and work on…☆20May 5, 2025Updated 9 months ago
- Code for: "Long-Context Autoregressive Video Modeling with Next-Frame Prediction"☆301Apr 23, 2025Updated 10 months ago
- (ICLR 2025 Spotlight) Official code repository for Interleaved Scene Graph.☆31Aug 7, 2025Updated 6 months ago
- ☆39May 20, 2025Updated 9 months ago
- Official implementation of the paper "You Do Not Fully Utilize Transformer's Representation Capacity"☆32May 28, 2025Updated 9 months ago