OPTML-Group/DeepZero

[ICLR'24] "DeepZero: Scaling up Zeroth-Order Optimization for Deep Model Training" by Aochuan Chen*, Yimeng Zhang*, Jinghan Jia, James Diffenderfer, Jiancheng Liu, Konstantinos Parasyris, Yihua Zhang, Zheng Zhang, Bhavya Kailkhura, Sijia Liu

☆72

Alternatives and similar repositories for DeepZero

Users who are interested in DeepZero are comparing it to the libraries listed below.

MathIsAll / ZO-AdaMU
This project is a PyTorch implementation of ZO-AdaMU optimization: Adapting Perturbation with the Momentum and Uncertainty in Zeroth-…
☆15 · Dec 12, 2023 · Updated 2 years ago
ZO-Bench / ZO-LLM
[ICML'24] Official code for the paper "Revisiting Zeroth-Order Optimization for Memory-Efficient LLM Fine-Tuning: A Benchmark".
☆128 · Jul 6, 2025 · Updated last year
amazon-science / mezo_svrg
Code for the ICML 2024 paper: "Variance-reduced Zeroth-Order Methods for Fine-Tuning Language Models"
☆12 · Jun 25, 2024 · Updated 2 years ago
zimingyy / SubZero
Zeroth-Order Fine-Tuning of LLMs in Random Subspaces (ICCV 2025)
☆20 · Nov 22, 2024 · Updated last year
Yanjun-Zhao / HiZOO
Second-Order Fine-Tuning without Pain for LLMs: a Hessian Informed Zeroth-Order Optimizer
☆26 · Feb 11, 2025 · Updated last year
yifanycc / AdaZeta
[EMNLP 24] Source code for paper 'AdaZeta: Adaptive Zeroth-Order Tensor-Train Adaption for Memory-Efficient Large Language Models Fine-Tu…
☆13 · Dec 15, 2024 · Updated last year
princeton-nlp / MeZO
[NeurIPS 2023] MeZO: Fine-Tuning Language Models with Just Forward Passes. https://arxiv.org/abs/2305.17333
☆1,168 · Jan 11, 2024 · Updated 2 years ago
damon-demon / Black-Box-Defense
Robustify Black-Box Models (ICLR'22 - Spotlight)
☆23 · Jan 29, 2023 · Updated 3 years ago
maifoundations / QZO
Fine-tuning Quantized Neural Networks with Zeroth-order Optimization
☆20 · Sep 17, 2025 · Updated 10 months ago
orobix / fwdgrad
Implementation of "Gradients without backpropagation" paper (https://arxiv.org/abs/2202.08587) using functorch
☆114 · Jun 14, 2023 · Updated 3 years ago
lzhangbv / eva
[ICLR 2023] Eva: Practical Second-order Optimization with Kronecker-vectorized Approximation
☆12 · Jul 31, 2023 · Updated 2 years ago
Chavdarova / LAGAN-Lookahead_Minimax
Source code for "Taming GANs with Lookahead–Minmax", ICLR 2021.
☆15 · Mar 28, 2021 · Updated 5 years ago
osehmathias / lisa
LISA: Layerwise Importance Sampling for Memory-Efficient Large Language Model Fine-Tuning
☆38 · Apr 4, 2024 · Updated 2 years ago
yunyuntsai / Black-box-Adversarial-Reprogramming
Code for "Transfer Learning without Knowing: Reprogramming Black-box Machine Learning Models with Scarce Data and Limited Resources". (IC…
☆38 · Nov 14, 2020 · Updated 5 years ago
OptimAI-Lab / Minimalist_LLM_Pretraining
[ICML 2026] Memory-Efficient LLM Pretraining via Minimalist Optimizer Design
☆21 · May 26, 2026 · Updated last month
Astuary / Spry
Code for "Thinking Forward: Memory-Efficient Federated Finetuning of Language Models" (NeurIPS 2024). Spry is a federated learning al…
☆13 · Oct 8, 2024 · Updated last year
OPTML-Group / ILM-VP
[CVPR23] "Understanding and Improving Visual Prompting: A Label-Mapping Perspective" by Aochuan Chen, Yuguang Yao, Pin-Yu Chen, Yihua Zha…
☆52 · Sep 17, 2023 · Updated 2 years ago
nikhilvyas / SOAP
☆273 · Dec 2, 2024 · Updated last year
Gunale0926 / Grams
Grams: Gradient Descent with Adaptive Momentum Scaling (ICLR 2025 Workshop)
☆17 · Mar 6, 2025 · Updated last year
intel / TVP
☆15 · Aug 4, 2025 · Updated 11 months ago
RyanWangZf / PAC-Bayes-IB
Official repo for PAC-Bayes Information Bottleneck. ICLR 2022.
☆49 · May 11, 2022 · Updated 4 years ago
ZidongLiu / DeComFL
[ICLR 2025] Achieving Dimension-Free Communication in Federated Learning via Zeroth-Order Optimization
☆28 · Jan 27, 2026 · Updated 5 months ago
machilusZ / FastGen
This repo contains the source code for: Model Tells You What to Discard: Adaptive KV Cache Compression for LLMs
☆44 · Aug 14, 2024 · Updated last year
RU-System-Software-and-Security / FeatureRE
☆27 · Nov 9, 2022 · Updated 3 years ago
rmeertens / Simplest-Tensorflow-Tensorboard-MNIST-Embedding-Visualisation
☆19 · Apr 10, 2017 · Updated 9 years ago
haolibai / APS-channel-search
Revisiting Parameter Sharing for Automatic Neural Channel Number Search, NeurIPS 2020
☆21 · Nov 15, 2020 · Updated 5 years ago
SFU-HiAccel / uBench
[FPGA'21] Microbenchmarks for Demystifying the Memory System of Modern Datacenter FPGAs for Software Programmers
☆31 · Dec 16, 2021 · Updated 4 years ago
rkteddy / channel-Lipschitzness-based-pruning
Source code for ECCV 2022 Poster: Data-free Backdoor Removal based on Channel Lipschitzness
☆36 · Jan 9, 2023 · Updated 3 years ago
cg563 / low-frequency-adversarial
☆30 · Jun 27, 2022 · Updated 4 years ago
TheBrainLab / SNN-Neural-Similarity-Static
Deep Spiking Neural Networks with High Representation Similarity Model Visual Pathways of Macaque and Mouse
☆18 · Oct 20, 2024 · Updated last year
zyushun / Adam-mini
Code for Adam-mini: Use Fewer Learning Rates To Gain More https://arxiv.org/abs/2406.16793
☆456 · May 13, 2025 · Updated last year
PSCLab-ASU / Systolic-CNN
☆18 · Feb 13, 2021 · Updated 5 years ago
SunDoge / typed-args
Parse command line arguments by defining dataclasses
☆13 · Updated this week
maxehre / polynomial_surrogates
Tools to construct surrogate models based on Hermitian polynomial bases. Includes full-factorial and sparse polynomial chaos expansions v…
☆10 · Nov 8, 2018 · Updated 7 years ago
katahiromz / MZC4
Mad Zombie Classic 4th
☆14 · Oct 8, 2023 · Updated 2 years ago
OPTML-Group / DP4TL
[NeurIPS2023] "Selectivity Drives Productivity: Efficient Dataset Pruning for Enhanced Transfer Learning" by Yihua Zhang*, Yimeng Zhang*,…
☆14 · Oct 12, 2023 · Updated 2 years ago
Unparalleled-Calvin / Fudan-course-search
☆10 · Jul 6, 2021 · Updated 5 years ago
lecoan / pytorch-RLE
An implementation of run-length encoding for PyTorch tensors using CUDA
☆14 · Apr 7, 2021 · Updated 5 years ago
slcz / gomoku-deep-learning
Gomoku AI with deep learning and Monte Carlo tree search
☆19 · Mar 23, 2018 · Updated 8 years ago