PKU-YuanGroup / UAE
Official repository for the UAE paper, unified-GRPO, and unified-Bench
☆158 · Sep 12, 2025 · Updated 5 months ago
Alternatives and similar repositories for UAE
Users interested in UAE are comparing it to the repositories listed below:
- Does Understanding Inform Generation in Unified Multimodal Models? From Analysis to Path Forward ☆60 · Nov 27, 2025 · Updated 2 months ago
- UniWorld: High-Resolution Semantic Encoders for Unified Visual Understanding and Generation ☆839 · Dec 23, 2025 · Updated last month
- [AAAI26] Next Patch Prediction ☆132 · Jan 2, 2025 · Updated last year
- Official PyTorch Implementation of "Latent Denoising Makes Good Visual Tokenizers" ☆172 · Dec 17, 2025 · Updated last month
- WeTok: Powerful Discrete Tokenization for High-Fidelity Visual Reconstruction ☆60 · Sep 3, 2025 · Updated 5 months ago
- WISE: A World Knowledge-Informed Semantic Evaluation for Text-to-Image Generation ☆185 · Nov 6, 2025 · Updated 3 months ago
- GPT as a Monte Carlo Language Tree: A Probabilistic Perspective ☆45 · Jan 18, 2025 · Updated last year
- Official Implementation of Paper Transfer between Modalities with MetaQueries ☆303 · Oct 12, 2025 · Updated 4 months ago
- [ICLR 2026] Lumos Project: Frontier video unified model research by Alibaba DAMO Academy. ☆151 · Jan 27, 2026 · Updated 2 weeks ago
- ☆189 · Dec 17, 2024 · Updated last year
- Diffusion Powers Video Tokenizer for Comprehension and Generation (CVPR 2025) ☆86 · Feb 27, 2025 · Updated 11 months ago
- [arXiv: 2502.05178] QLIP: Text-Aligned Visual Tokenization Unifies Auto-Regressive Multimodal Understanding and Generation ☆95 · Mar 1, 2025 · Updated 11 months ago
- Pytorch implementation for the paper titled "SimpleAR: Pushing the Frontier of Autoregressive Visual Generation" ☆427 · Jun 20, 2025 · Updated 7 months ago
- 【COLING 2025🔥】Code for the paper "Is Parameter Collision Hindering Continual Learning in LLMs?". ☆38 · Dec 5, 2024 · Updated last year
- [ICLR 2026] This is an early exploration to introduce Interleaving Reasoning to Text-to-image Generation field and achieve the SoTA bench… ☆87 · Jan 26, 2026 · Updated 2 weeks ago
- [NeurIPS 2025 Spotlight] A Unified Tokenizer for Visual Generation and Understanding ☆508 · Nov 14, 2025 · Updated 3 months ago
- [ICLR 2026] Official repo of paper "Reconstruction Alignment Improves Unified Multimodal Models". Unlocking the Massive Zero-shot Potenti… ☆361 · Feb 5, 2026 · Updated last week
- Code for MetaMorph Multimodal Understanding and Generation via Instruction Tuning ☆234 · Jan 22, 2026 · Updated 3 weeks ago
- FocusUI: Efficient UI Grounding via Position-Preserving Visual Token Selection ☆24 · Updated this week
- Unlocking the Essence of Beauty: Advanced Aesthetic Reasoning with Relative-Absolute Policy Optimization ☆20 · Jan 27, 2026 · Updated 2 weeks ago
- [CoRL 2025] Robot Learning from Any Images ☆34 · Nov 11, 2025 · Updated 3 months ago
- [NeurIPS 2025 D&B🔥] ImgEdit: A Unified Image Editing Dataset and Benchmark ☆276 · Nov 5, 2025 · Updated 3 months ago
- Official implementation of BLIP3o-Series ☆1,635 · Nov 29, 2025 · Updated 2 months ago
- Code for the paper "AsFT: Anchoring Safety During LLM Fine-Tuning Within Narrow Safety Basin". ☆35 · Jul 10, 2025 · Updated 7 months ago
- Code for paper "Rethinking Text-based Protein Understanding: Retrieval or LLM?" ☆18 · Oct 7, 2025 · Updated 4 months ago
- Official code for "Rethinking Chain-of-Thought Reasoning for Videos" ☆20 · Dec 14, 2025 · Updated 2 months ago
- logit lens for VGGT ☆26 · Dec 2, 2025 · Updated 2 months ago
- [CVPR 2025] Exploring the Deep Fusion of Large Language Models and Diffusion Transformers for Text-to-Image Synthesis ☆130 · May 16, 2025 · Updated 9 months ago
- ICML2025 ☆63 · Aug 28, 2025 · Updated 5 months ago
- Code for FreeTraj, a tuning-free method for trajectory-controllable video generation ☆111 · Sep 19, 2025 · Updated 4 months ago
- ☆38 · Jun 12, 2025 · Updated 8 months ago
- Video-GPT via Next Clip Diffusion. ☆44 · Jun 2, 2025 · Updated 8 months ago
- Official PyTorch implementation of the paper "Equivariant Image Modeling" (https://arxiv.org/abs/2503.18948) ☆34 · Aug 1, 2025 · Updated 6 months ago
- ☆21 · Dec 10, 2025 · Updated 2 months ago
- Notebooks for managing NeurIPS 2014 and analysing the NeurIPS experiment. ☆13 · May 22, 2024 · Updated last year
- A Decade of Action Quality Assessment: Largest Systematic Survey of Trends, Challenges, and Future Directions ☆15 · Jan 22, 2026 · Updated 3 weeks ago
- [CVPR 2025] The First Investigation of CoT Reasoning (RL, TTS, Reflection) in Image Generation ☆855 · May 23, 2025 · Updated 8 months ago
- HART: Efficient Visual Generation with Hybrid Autoregressive Transformer ☆648 · Oct 16, 2024 · Updated last year
- Native Multimodal Models are World Learners ☆1,456 · Dec 30, 2025 · Updated last month