qhfan / MALA
[ICCV2025 highlight] Rectifying Magnitude Neglect in Linear Attention
☆34 Updated last month
Alternatives and similar repositories for MALA
Users that are interested in MALA are comparing it to the libraries listed below
- [CVPR 2025] CoDe: Collaborative Decoding Makes Visual Auto-Regressive Modeling Efficient ☆105 Updated 5 months ago
- Implements VAR+CLIP for text-to-image (T2I) generation ☆148 Updated 7 months ago
- ☆75 Updated last month
- High-performance Image Tokenizers for VAR and AR ☆288 Updated 4 months ago
- [AAAI 2025] Linear-complexity Visual Sequence Learning with Gated Linear Attention ☆113 Updated last year
- Official repository of InLine attention (NeurIPS 2024) ☆55 Updated 8 months ago
- [CVPR 2025] DiG: Scalable and Efficient Diffusion Models with Gated Linear Attention ☆172 Updated 6 months ago
- [NeurIPS2024 Spotlight] The official implementation of MambaTree: Tree Topology is All You Need in State Space Model ☆101 Updated last year
- [NIPS 2025 DB Oral] Official Repository of paper: Envisioning Beyond the Pixels: Benchmarking Reasoning-Informed Visual Editing ☆95 Updated last week
- [CVPR 2025 (Oral)] Open implementation of "RandAR" ☆194 Updated 2 months ago
- ☆87 Updated 5 months ago
- 🚀 Video Compression Commander: Plug-and-Play Inference Acceleration for Video Large Language Models ☆33 Updated last week
- [ICCV2025] Code Release of Harmonizing Visual Representations for Unified Multimodal Understanding and Generation ☆162 Updated 3 months ago
- TokLIP: Marry Visual Tokens to CLIP for Multimodal Comprehension and Generation ☆215 Updated last month
- [ICCV25] USP: Unified Self-Supervised Pretraining for Image Generation and Understanding ☆89 Updated 2 months ago
- [NeurIPS 24] MoE Jetpack: From Dense Checkpoints to Adaptive Mixture of Experts for Vision Tasks ☆129 Updated 9 months ago
- [CVPR 2025] DyCoke: Dynamic Compression of Tokens for Fast Video Large Language Models ☆74 Updated last week
- [CVPR 2025] 🔥 Official impl. of "TokenFlow: Unified Image Tokenizer for Multimodal Understanding and Generation". ☆379 Updated last month
- Official repository of the paper "A Glimpse to Compress: Dynamic Visual Token Pruning for Large Vision-Language Models" ☆62 Updated last week
- [CVPR 2025] Mono-InternVL: Pushing the Boundaries of Monolithic Multimodal Large Language Models with Endogenous Visual Pre-training ☆82 Updated 2 months ago
- [ICCV 2025] Official implementation of LLaVA-KD: A Framework of Distilling Multimodal Large Language Models ☆95 Updated 2 months ago
- [ICCV2025] Generate one 2K image on single 3090 GPU! ☆65 Updated last week
- [ICML 2025] This is the official PyTorch implementation of "ZipAR: Accelerating Auto-regressive Image Generation through Spatial Locality… ☆54 Updated 5 months ago
- [CVPR2025] Breaking the Low-Rank Dilemma of Linear Attention ☆28 Updated 6 months ago
- HoliTom: Holistic Token Merging for Fast Video Large Language Models ☆43 Updated last month
- This is the official implementation for ControlVAR. ☆120 Updated 9 months ago
- [ICLR'25] Reconstructive Visual Instruction Tuning ☆116 Updated 5 months ago
- [CVPR 2025] HMAR: Efficient Hierarchical Masked Auto-Regressive Image Generation ☆49 Updated 2 months ago
- CAR: Controllable AutoRegressive Modeling for Visual Generation ☆123 Updated 9 months ago
- [NeurIPS 2024] Visual Perception by Large Language Model’s Weights ☆45 Updated 5 months ago