Second-Order Fine-Tuning without Pain for LLMs: a Hessian Informed Zeroth-Order Optimizer
☆26Feb 11, 2025Updated last year
Alternatives and similar repositories for HiZOO
Users that are interested in HiZOO are comparing it to the libraries listed below.
- Code for the ICML 2024 paper: "Variance-reduced Zeroth-Order Methods for Fine-Tuning Language Models"☆12Jun 25, 2024Updated last year
- [ICLR'24] "DeepZero: Scaling up Zeroth-Order Optimization for Deep Model Training" by Aochuan Chen*, Yimeng Zhang*, Jinghan Jia, James Di…☆71Oct 9, 2024Updated last year
- ☆36May 28, 2024Updated last year
- Pytorch implementation of KFAC - this is a port of https://github.com/tensorflow/kfac/☆31Jun 6, 2024Updated last year
- This project is a PyTorch implementation of ZO-AdaMU optimization: Adapting Perturbation with the Momentum and Uncertainty in Zeroth-…☆14Dec 12, 2023Updated 2 years ago
- Parse command line arguments by defining dataclasses☆13Oct 13, 2024Updated last year
- An implementation of run-length encoding for PyTorch tensors using CUDA☆14Apr 7, 2021Updated 5 years ago
- [EMNLP 24] Source code for paper 'AdaZeta: Adaptive Zeroth-Order Tensor-Train Adaption for Memory-Efficient Large Language Models Fine-Tu…☆13Dec 15, 2024Updated last year
- Official PyTorch implementation of CD-MOE☆12Mar 18, 2026Updated last month
- Wrapper for Ckmeans.1d.dp.☆13Mar 20, 2025Updated last year
- Implementation of the FedPM framework by the authors of the ICLR 2023 paper "Sparse Random Networks for Communication-Efficient Federated…☆31Feb 10, 2023Updated 3 years ago
- This notebook presents an example of the equal risk pricing framework with deep hedging from my paper Carbonneau, A. and Godin, F. (2020)…☆15Oct 15, 2021Updated 4 years ago
- A repository introducing algorithmic information theory. Learn what Kolmogorov complexity is and why it is important.☆13Jul 23, 2025Updated 9 months ago
- ☆11Dec 8, 2016Updated 9 years ago
- Randomized algorithm class at CU☆15Jul 8, 2025Updated 9 months ago
- ☆12Aug 17, 2022Updated 3 years ago
- This is the official implementation of the ICML 2023 paper "Can Forward Gradient Match Backpropagation?"☆13May 31, 2023Updated 2 years ago
- Pytorch implementation of our paper accepted by ICML 2023 -- "Bi-directional Masks for Efficient N:M Sparse Training"☆13Jun 7, 2023Updated 2 years ago
- [NeurIPS 2023] MeZO: Fine-Tuning Language Models with Just Forward Passes. https://arxiv.org/abs/2305.17333☆1,165Jan 11, 2024Updated 2 years ago
- ☆13Sep 25, 2023Updated 2 years ago
- A toolkit for constructing schedule-abstraction graph in Python☆12Feb 7, 2025Updated last year
- ☆13Jul 6, 2023Updated 2 years ago
- LLMA = LLM + arithmetic coder: uses an LLM to compress text data aggressively, achieving extremely high compression ratios.☆22Nov 24, 2024Updated last year
- Provides the code for the paper "EBPC: Extended Bit-Plane Compression for Deep Neural Network Inference and Training Accelerators" by Luk…☆19Oct 6, 2019Updated 6 years ago
- ☆16Aug 25, 2021Updated 4 years ago
- ☆15Feb 20, 2024Updated 2 years ago
- ☆12Jul 6, 2022Updated 3 years ago
- ☆14Feb 2, 2021Updated 5 years ago
- A repository for LotteryFL re-implementation and experiments☆13Dec 18, 2020Updated 5 years ago
- PyTorch optimizers with sparse momentum and weight decay☆10Oct 3, 2020Updated 5 years ago
- Performing Symbolic Regression via Monte Carlo Tree Search (MCTS)☆14Nov 2, 2018Updated 7 years ago
- [TMLR 2026 J2C Certification] Previously at GenBio ICML 2025☆19Updated this week
- Implementation of Gradient Information Optimization (GIO) for effective and scalable training data selection☆14Jun 22, 2023Updated 2 years ago
- Optimization methods in ML☆44Apr 20, 2026Updated last week
- Mitigating Lost-in-Retrieval Problems in Retrieval Augmented Multi-Hop Question Answering, ACL 2025☆21Oct 28, 2025Updated 6 months ago
- Task Aware Downscaling for efficient storing and accurate reconstruction in image and video domain☆12Jul 25, 2024Updated last year
- ☆12Nov 21, 2024Updated last year
- Initially a fork of the GitHub repository for the paper "Informer" accepted by AAAI 2021. Heavily modified since then.☆15Apr 7, 2023Updated 3 years ago
- A 12 week intensive course about Python and data science.☆21May 3, 2024Updated last year