Yanjun-Zhao / HiZOO

Second-Order Fine-Tuning without Pain for LLMs: a Hessian Informed Zeroth-Order Optimizer

☆16

Alternatives and similar repositories for HiZOO

Users who are interested in HiZOO are comparing it to the repositories listed below.

OPTML-Group / DeepZero
[ICLR'24] "DeepZero: Scaling up Zeroth-Order Optimization for Deep Model Training" by Aochuan Chen*, Yimeng Zhang*, Jinghan Jia, James Di…
☆66 Updated 10 months ago
ZO-Bench / ZO-LLM
[ICML 2024] Official code for the paper "Revisiting Zeroth-Order Optimization for Memory-Efficient LLM Fine-Tuning: A Benchmark".
☆109 Updated last month
andyjm3 / SLTrain
SLTrain: a sparse plus low-rank approach for parameter and memory efficient pretraining (NeurIPS 2024)
☆32 Updated 9 months ago
Shiweiliuiiiiiii / In-Time-Over-Parameterization
[ICML 2021] "Do We Actually Need Dense Over-Parameterization? In-Time Over-Parameterization in Sparse Training" by Shiwei Liu, Lu Yin, De…
☆45 Updated last year
lzhangbv / eva
[ICLR 2023] Eva: Practical Second-order Optimization with Kronecker-vectorized Approximation
☆12 Updated 2 years ago
nblt / DLDR
[TPAMI 2023] Low Dimensional Landscape Hypothesis is True: DNNs can be Trained in Tiny Subspaces
☆42 Updated 3 years ago
osehmathias / lisa
LISA: Layerwise Importance Sampling for Memory-Efficient Large Language Model Fine-Tuning
☆33 Updated last year
lliai / LoRA-Zoo
Awesome-Low-Rank-Adaptation
☆115 Updated 9 months ago
yxli2123 / LoSparse
☆59 Updated last year
zyushun / hessian-spectrum
Code for the paper: Why Transformers Need Adam: A Hessian Perspective
☆60 Updated 5 months ago
zyxxmu / DSnoT
Official PyTorch implementation of our paper accepted at ICLR 2024, Dynamic Sparse No Training: Training-Free Fine-tuning for Sparse LLM…
☆49 Updated last year
ycjing / Awesome-Model-Merging
A curated list of Model Merging methods.
☆92 Updated 10 months ago
hahnyuan / ASVD4LLM
Activation-aware Singular Value Decomposition for Compressing Large Language Models
☆74 Updated 9 months ago
ZhengaoLi / DISP-LLM-Dimension-Independent-Structural-Pruning
An implementation of the DISP-LLM method from the NeurIPS 2024 paper: Dimension-Independent Structural Pruning for Large Language Models.
☆21 Updated this week
JingXuTHU / Random-Masking-Finds-Winning-Tickets-for-Parameter-Efficient-Fine-tuning
☆14 Updated last year
imagination-research / EEP
Efficient Expert Pruning for Sparse Mixture-of-Experts Language Models: Enhancing Performance and Reducing Inference Costs
☆18 Updated 8 months ago
falcon-xu / early-exit-papers
A curated list of early-exiting papers (LLM, CV, NLP, etc.)
☆58 Updated 11 months ago
VITA-Group / GraNet
[NeurIPS 2021] Sparse Training via Boosting Pruning Plasticity with Neuroregeneration
☆31 Updated 2 years ago
UbiquitousLearning / Backpropagation_Free_Training_Survey
☆24 Updated last year
boone891214 / MEST
[NeurIPS 2021] "MEST: Accurate and Fast Memory-Economic Sparse Training Framework on the Edge", Geng Yuan, Xiaolong Ma, Yanzhi Wang et al…
☆18 Updated 3 years ago
biomedical-cybernetics / Relative-importance-and-activation-pruning
☆49 Updated last year
tanganke / subspace_fusion
Code for paper "Concrete Subspace Learning based Interference Elimination for Multi-task Model Fusion"
☆13 Updated last year
stephenqz / OATS
GitHub repo for OATS: Outlier-Aware Pruning through Sparse and Low Rank Decomposition
☆13 Updated 3 months ago
nik-dim / tall_masks
Official repository of "Localizing Task Information for Improved Model Merging and Compression" [ICML 2024]
☆47 Updated 9 months ago
nsfzyzz / loss_landscape_taxonomy
[NeurIPS 2021] code for "Taxonomizing local versus global structure in neural network loss landscapes" https://arxiv.org/abs/2107.11228
☆19 Updated 3 years ago
pvti / Awesome-Tensor-Decomposition
😎 A curated list of tensor decomposition resources for model compression.
☆77 Updated last week
LGrCo / L-GreCo
An Efficient and General Framework for Layerwise-Adaptive Gradient Compression
☆14 Updated last year
aim-uofa / LoRAPrune
☆57 Updated 7 months ago
cjyaras / deep-lora-transformers
Compressible Dynamics in Deep Overparameterized Low-Rank Learning & Adaptation (ICML'24 Oral)
☆13 Updated last year
QingruZhang / PLATON
This PyTorch package implements PLATON: Pruning Large Transformer Models with Upper Confidence Bound of Weight Importance (ICML 2022).
☆46 Updated 2 years ago