YuanheZ / LoRA-One
LoRA-One: One-Step Full Gradient Could Suffice for Fine-Tuning Large Language Models, Provably and Efficiently (ICML 2025 Oral)
☆24 · Updated last month
Alternatives and similar repositories for LoRA-One
Users interested in LoRA-One are comparing it with the repositories listed below.
- ICLR 2025 ☆29 · Updated 6 months ago
- toy reproduction of Auxiliary-Loss-Free Load Balancing Strategy for Mixture-of-Experts ☆25 · Updated last year
- Official implementation of ICLR 2025 'LORO: Parameter and Memory Efficient Pretraining via Low-rank Riemannian Optimization' ☆13 · Updated 6 months ago
- Data distillation benchmark ☆71 · Updated 5 months ago
- [NeurIPS'25] dKV-Cache: The Cache for Diffusion Language Models ☆119 · Updated 6 months ago
- This repository is the implementation of the paper "Training Free Pretrained Model Merging" (CVPR 2024). ☆32 · Updated last year
- A generalized framework for subspace tuning methods in parameter efficient fine-tuning. ☆160 · Updated 4 months ago
- dParallel: Learnable Parallel Decoding for dLLMs ☆42 · Updated last month
- [ICML 2025] This is the official PyTorch implementation of "ZipAR: Accelerating Auto-regressive Image Generation through Spatial Locality… ☆53 · Updated 7 months ago
- Bag of Design Choices for Inference of High-Resolution Masked Generative Transformer ☆16 · Updated last year
- Dimple, the first Discrete Diffusion Multimodal Large Language Model ☆109 · Updated 4 months ago
- Official implementation of "Diffusion Language Models Know the Answer Before Decoding" ☆39 · Updated 2 months ago
- Code for CVPR 2024 Oral "Neural Lineage" ☆17 · Updated last year
- [NeurIPS 2025] VeriThinker: Learning to Verify Makes Reasoning Model Efficient ☆61 · Updated last month
- [NeurIPS 2025] ScaleKV: Memory-Efficient Visual Autoregressive Modeling with Scale-Aware KV Cache Compression ☆51 · Updated 2 weeks ago
- [NeurIPS 2025] Official implementation for our paper "Scaling Diffusion Transformers Efficiently via μP". ☆92 · Updated 3 weeks ago
- ☆61 · Updated 11 months ago
- EditScore: Unlocking Online RL for Image Editing via High-Fidelity Reward Modeling ☆164 · Updated 2 weeks ago
- ☆17 · Updated 2 years ago
- Official code for our paper "LoRA-Pro: Are Low-Rank Adapters Properly Optimized?" ☆135 · Updated 7 months ago
- [ECCV 2024] Official PyTorch implementation of "Switch Diffusion Transformer: Synergizing Denoising Tasks with Sparse Mixture-of-Experts" ☆47 · Updated last year
- Paper survey of efficient computation for large-scale models. ☆34 · Updated 11 months ago
- [ICLR 2024] Towards Lossless Dataset Distillation via Difficulty-Aligned Trajectory Matching ☆102 · Updated last year
- [NeurIPS 2024] Learning-to-Cache: Accelerating Diffusion Transformer via Layer Caching ☆116 · Updated last year
- The loss landscape of Large Language Models resembles a basin! ☆33 · Updated 4 months ago
- Codebase for the paper "Elucidating the design space of language models for image generation" ☆46 · Updated last year
- A PyTorch implementation of the paper "Revisiting Non-Autoregressive Transformers for Efficient Image Synthesis" ☆46 · Updated last year
- ☆24 · Updated 4 months ago
- [CVPR 2025] CoDe: Collaborative Decoding Makes Visual Auto-Regressive Modeling Efficient ☆107 · Updated last month
- ☆19 · Updated 10 months ago