UbiquitousLearning / Backpropagation_Free_Training_Survey
☆23, updated last year
Alternatives and similar repositories for Backpropagation_Free_Training_Survey
Users interested in Backpropagation_Free_Training_Survey are comparing it to the repositories listed below
- [ICML 2024] Official code for the paper "Revisiting Zeroth-Order Optimization for Memory-Efficient LLM Fine-Tuning: A Benchmark" (☆104, updated 11 months ago; a minimal zeroth-order gradient estimator is sketched after this list)
- [ICLR'24] "DeepZero: Scaling up Zeroth-Order Optimization for Deep Model Training" by Aochuan Chen*, Yimeng Zhang*, Jinghan Jia, James Di… (☆57, updated 7 months ago)
- Second-Order Fine-Tuning without Pain for LLMs: a Hessian Informed Zeroth-Order Optimizer (☆15, updated 3 months ago)
- SLTrain: a sparse plus low-rank approach for parameter and memory efficient pretraining (NeurIPS 2024) (☆31, updated 7 months ago)
- Compressible Dynamics in Deep Overparameterized Low-Rank Learning & Adaptation (ICML'24 Oral) (☆14, updated 10 months ago)
- ☆35, updated 2 years ago
- Preprint: Asymmetry in Low-Rank Adapters of Foundation Models (☆36, updated last year)
- Code for the paper "Why Transformers Need Adam: A Hessian Perspective" (☆59, updated 2 months ago)
- Efficient Expert Pruning for Sparse Mixture-of-Experts Language Models: Enhancing Performance and Reducing Inference Costs (☆16, updated 5 months ago)
- The official implementation of TinyTrain [ICML '24] (☆22, updated 10 months ago)
- [EMNLP 24] Source code for the paper 'AdaZeta: Adaptive Zeroth-Order Tensor-Train Adaption for Memory-Efficient Large Language Models Fine-Tu…' (☆11, updated 5 months ago)
- AdaMerging: Adaptive Model Merging for Multi-Task Learning (ICLR 2024) (☆82, updated 7 months ago)
- ☆13, updated last year
- [NeurIPS 2023 Spotlight] Temperature Balancing, Layer-wise Weight Analysis, and Neural Network Training (☆35, updated 2 months ago)
- Official code for the paper "Examining Post-Training Quantization for Mixture-of-Experts: A Benchmark" (☆20, updated 11 months ago)
- Implementation of CoLA: Compute-Efficient Pre-Training of LLMs via Low-Rank Activation (☆22, updated 3 months ago)
- Official implementation of SLEB: Streamlining LLMs through Redundancy Verification and Elimination of Transformer Blocks (☆37, updated 4 months ago)
- Activation-aware Singular Value Decomposition for Compressing Large Language Models (☆68, updated 7 months ago)
- LISA: Layerwise Importance Sampling for Memory-Efficient Large Language Model Fine-Tuning (☆31, updated last year)
- LoRA-XS: Low-Rank Adaptation with Extremely Small Number of Parameters (☆35, updated 3 months ago; a minimal low-rank adapter is sketched after this list)
- [ICLR 2023] Eva: Practical Second-order Optimization with Kronecker-vectorized Approximation (☆12, updated last year)
- A curated list of early-exiting methods (LLM, CV, NLP, etc.) (☆53, updated 9 months ago)
- ☆67, updated 6 months ago
- This PyTorch package implements PLATON: Pruning Large Transformer Models with Upper Confidence Bound of Weight Importance (ICML 2022) (☆46, updated 2 years ago)
- [NeurIPS 2024] AlphaPruning: Using Heavy-Tailed Self Regularization Theory for Improved Layer-wise Pruning of Large Language Models (☆23, updated 2 months ago)
- A curated list of model merging methods (☆92, updated 8 months ago)
- Official code for the ICLR 2025 paper "Dobi-SVD: Differentiable SVD for LLM Compression and Some New Perspectives" (☆31, updated 2 months ago)
- ☆25, updated 9 months ago
- ☆50, updated 6 months ago
- Official repository for "LAVA: Data Valuation without Pre-Specified Learning Algorithms" (ICLR 2023) (☆48, updated last year)
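Several of the entries above (the ICML 2024 ZO benchmark, DeepZero, the Hessian-informed zeroth-order optimizer, AdaZeta) revolve around zeroth-order optimization, which trains without backpropagation by estimating gradients from loss values alone. Below is a minimal sketch of the two-point (SPSA/MeZO-style) estimator on a plain NumPy quadratic; the loss function, step sizes, and loop are illustrative and not taken from any of the listed repositories.

```python
import numpy as np

def zo_gradient(loss_fn, theta, mu=1e-3, seed=None):
    """Two-point zeroth-order gradient estimate.

    Samples one random direction z and returns
    (L(theta + mu*z) - L(theta - mu*z)) / (2*mu) * z,
    an unbiased (in the smoothed sense) gradient estimate
    that needs only forward evaluations -- no backpropagation.
    """
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(theta.shape)
    loss_plus = loss_fn(theta + mu * z)
    loss_minus = loss_fn(theta - mu * z)
    return (loss_plus - loss_minus) / (2 * mu) * z

# Illustrative usage: ZO-SGD on a toy quadratic loss.
def quadratic_loss(theta):
    return float(np.sum((theta - 1.0) ** 2))

theta = np.zeros(10)
lr = 0.01
for step in range(500):
    grad_est = zo_gradient(quadratic_loss, theta, mu=1e-3, seed=step)
    theta -= lr * grad_est

print(round(quadratic_loss(theta), 4))  # should be close to 0
```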
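Several other entries (SLTrain, Asymmetry in Low-Rank Adapters, LISA, LoRA-XS, CoLA) build on low-rank parameterizations of weight updates. Below is a minimal PyTorch sketch of a generic LoRA-style linear layer, not the code of any listed repository; the layer sizes, rank, scaling, and class name are illustrative assumptions.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """A frozen linear layer plus a trainable low-rank update W + (alpha/r) * B @ A."""

    def __init__(self, in_features, out_features, rank=8, alpha=16):
        super().__init__()
        self.base = nn.Linear(in_features, out_features, bias=False)
        self.base.weight.requires_grad_(False)          # pretrained weight stays frozen
        self.lora_A = nn.Parameter(torch.randn(rank, in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(out_features, rank))  # zero init: no change at start
        self.scaling = alpha / rank

    def forward(self, x):
        # Only lora_A and lora_B receive gradients during fine-tuning.
        return self.base(x) + self.scaling * (x @ self.lora_A.T) @ self.lora_B.T

# Illustrative usage
layer = LoRALinear(128, 64, rank=4)
out = layer(torch.randn(2, 128))
print(out.shape)  # torch.Size([2, 64])
```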