[CVPR 2025] HMAR: Efficient Hierarchical Masked Auto-Regressive Image Generation
☆61Jul 8, 2025Updated 7 months ago
Alternatives and similar repositories for HMAR
Users who are interested in HMAR are comparing it to the libraries listed below
- Official implementation of ICCV 2025 paper "EgoAgent: A Joint Predictive Agent Model in Egocentric Worlds".☆43Jun 30, 2025Updated 8 months ago
- ☆21Feb 13, 2026Updated 2 weeks ago
- [ICML 2025] Streamline Without Sacrifice - Squeeze out Computation Redundancy in LMM☆20May 22, 2025Updated 9 months ago
- Official PyTorch code for the paper "Time-Aware One Step Diffusion Network for Real-World Image Super-Resolution"☆24Aug 27, 2025Updated 6 months ago
- Official implementation of "AMD: Autoregressive Motion Diffusion"☆20Nov 10, 2024Updated last year
- [ICCV 2025] Preacher: Paper-to-Video Agentic System☆33Sep 1, 2025Updated 6 months ago
- Official Implementation of "Synthesizing Long-Term Human Motions with Diffusion Models via Coherent Sampling"☆15Nov 20, 2023Updated 2 years ago
- Humos paper repository☆27Sep 6, 2025Updated 5 months ago
- [ICCV'25] FreeMorph: Tuning-Free Generalized Image Morphing with Diffusion Model☆84Jul 24, 2025Updated 7 months ago
- ☆41May 15, 2025Updated 9 months ago
- [NeurIPS 2024] ENAT: Rethinking Spatial-temporal Interactions in Token-based Image Synthesis☆25Nov 28, 2024Updated last year
- [ICLR 2026] Code for our paper "Next Visual Granularity Generation".☆49Jan 26, 2026Updated last month
- [ICCV 2025] ETA: Efficiency through Thinking Ahead, A Dual Approach to Self-Driving with Large Models☆41Jul 2, 2025Updated 8 months ago
- A framework that allows you to apply Sparse AutoEncoder on any models☆51Jul 11, 2025Updated 7 months ago
- (ICLR 2025 Spotlight) Official code repository for Interleaved Scene Graph.☆31Aug 7, 2025Updated 6 months ago
- Code and data for UniEgoMotion (ICCV 2025)☆44Nov 11, 2025Updated 3 months ago
- ☆47Jan 26, 2026Updated last month
- ☆63Jul 11, 2025Updated 7 months ago
- [NeurIPS 2024] Official implementation of InterControl☆83Feb 20, 2025Updated last year
- [CVPR2024] Official implementation of the paper: Skeleton-in-Context: Unified Skeleton Sequence Modeling with In-Context Learning☆40Aug 15, 2025Updated 6 months ago
- Official implementation of "VIRAL: Visual Representation Alignment for MLLMs".☆149Sep 21, 2025Updated 5 months ago
- [ICCV 2025 Workshop Outstanding Paper Award] VChain: Chain-of-Visual-Thought for Reasoning in Video Generation☆115Oct 7, 2025Updated 4 months ago
- [ICLR 2026] This is an early exploration to introduce Interleaving Reasoning to Text-to-image Generation field and achieve the SoTA bench…☆87Jan 26, 2026Updated last month
- 📷 [CVPR'26] Camera-controlled text-to-video generation, now with intrinsics, distortion and orientation control!☆126Feb 21, 2026Updated last week
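- Some of my practices on Algorithms : ) This repository holds some of my solutions to problems from LeetCode and 剑指Offer, with the necessary comments kept in the code. They are not necessarily optimal solutions, but I try to keep the code concise and easy to understand. Other problem sets will be integrated later; if you find any errors, I hope you will tell me or help me…☆11Dec 3, 2024Updated last year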
- Official repository for "Pre- to Post-Contrast Breast MRI Synthesis for Enhanced Tumour Segmentation"☆12Jan 31, 2024Updated 2 years ago
- Vision Transformer (ViT) models, with their attention mechanisms, revolutionized computer vision. By merging Class Activation Map (CAM) a…☆13Aug 14, 2023Updated 2 years ago
- Official implementation of CMMCoT: Enhancing Complex Multi-Image Comprehension via Multi-Modal Chain-of-Thought and Memory Augmentation☆12Dec 5, 2025Updated 2 months ago
- ☆12Aug 25, 2023Updated 2 years ago
- Multi-human Interactive Talking Dataset☆68Aug 6, 2025Updated 6 months ago
- [NeurIPS '25 Spotlight] Official Pytorch implementation of "Vision Transformers Don't Need Trained Registers"☆172Sep 19, 2025Updated 5 months ago
- Code release for the paper "DIMOS: Synthesizing Diverse Human Motions in 3D Indoor Scenes"☆98Mar 15, 2025Updated 11 months ago
- [ICLR 2025] Ready-to-React: Online Reaction Policy for Two-Character Interaction Generation☆45Mar 13, 2025Updated 11 months ago
- ☆13Jul 28, 2024Updated last year
- ☆12Mar 5, 2024Updated last year
- EmoCapCLIP: Learning Transferable Facial Emotion Representations from Large-Scale Semantically Rich Captions☆20Jul 29, 2025Updated 7 months ago
- [NeurIPS 2025] This is the official repository for "RAD: Towards Trustworthy Retrieval-Augmented Multi-modal Clinical Diagnosis"☆26Nov 21, 2025Updated 3 months ago
- ☆10May 9, 2019Updated 6 years ago
- ☆22Nov 18, 2025Updated 3 months ago