Zeroth-Order Fine-Tuning of LLMs in Random Subspaces (ICCV 2025)
☆19 · Nov 22, 2024 · Updated last year
Alternatives and similar repositories for SubZero
Users that are interested in SubZero are comparing it to the libraries listed below.
- Second-Order Fine-Tuning without Pain for LLMs: a Hessian Informed Zeroth-Order Optimizer ☆24 · Feb 11, 2025 · Updated last year
- ☆19 · Dec 5, 2024 · Updated last year
- 4-bit Shampoo for Memory-Efficient Network Training (NeurIPS 2024) ☆13 · Feb 13, 2025 · Updated last year
- [EMNLP 24] Source code for paper 'AdaZeta: Adaptive Zeroth-Order Tensor-Train Adaption for Memory-Efficient Large Language Models Fine-Tu… ☆13 · Dec 15, 2024 · Updated last year
- Official implementation of ICLR 2025 'LORO: Parameter and Memory Efficient Pretraining via Low-rank Riemannian Optimization' ☆16 · Apr 24, 2025 · Updated 11 months ago
- ☆17 · Dec 7, 2025 · Updated 4 months ago
- [ICLR'24] "DeepZero: Scaling up Zeroth-Order Optimization for Deep Model Training" by Aochuan Chen*, Yimeng Zhang*, Jinghan Jia, James Di… ☆71 · Oct 9, 2024 · Updated last year
- Fine-tuning Quantized Neural Networks with Zeroth-order Optimization ☆17 · Sep 17, 2025 · Updated 6 months ago
- Code for "Thinking Forward: Memory-Efficient Federated Finetuning of Language Models" (NeurIPS 2024). Spry is a federated learning al… ☆12 · Oct 8, 2024 · Updated last year
- This is a detailed code demo on how to conduct Full-Param Supervised Fine-tuning (SFT) and DPO (Direct Preference Optimization) ☆19 · Jan 9, 2025 · Updated last year
- Official Implementation of "GRIFFIN: Effective Token Alignment for Faster Speculative Decoding" [NeurIPS 2025] ☆18 · May 12, 2025 · Updated 11 months ago
- Parse command line arguments by defining dataclasses ☆13 · Oct 13, 2024 · Updated last year
- An implementation of run-length encoding for PyTorch tensors using CUDA ☆14 · Apr 7, 2021 · Updated 5 years ago
- Code for the ICML 2024 paper: "Variance-reduced Zeroth-Order Methods for Fine-Tuning Language Models" ☆12 · Jun 25, 2024 · Updated last year
- [TMM 2023] Blind Image Quality Assessment via Transformer Predicted Error Map and Perceptual Quality Token ☆14 · Mar 21, 2024 · Updated 2 years ago
- ☆11 · Dec 8, 2016 · Updated 9 years ago
- Audio Masking Methods ☆12 · Nov 15, 2019 · Updated 6 years ago
- Two methods: Multi-Frame method with MAP, total variation, sparse representation, and total variation in time domain; and Single-Frame with CNN ne… ☆13 · Nov 16, 2018 · Updated 7 years ago
- Large-scale Bound-constrained Optimization ☆14 · Jul 27, 2021 · Updated 4 years ago
- Code for the article "Accelerated Forward-Backward Optimization using Deep Learning" ☆12 · Sep 15, 2021 · Updated 4 years ago
- Experimenting with Lapped Transforms Jupyter Notebook ☆14 · Jun 13, 2025 · Updated 10 months ago
- ☆11 · Jun 22, 2022 · Updated 3 years ago
- Beamer template based on Flux-beamer ☆10 · Nov 13, 2023 · Updated 2 years ago
- Provides the code for the paper "EBPC: Extended Bit-Plane Compression for Deep Neural Network Inference and Training Accelerators" by Luk… ☆19 · Oct 6, 2019 · Updated 6 years ago
- AAAI2025 ☆12 · Apr 18, 2025 · Updated 11 months ago
- Repository to run the Tensor Convolutional Dictionary Learning with FISTA (TC-FISTA) algorithm ☆12 · Jun 22, 2022 · Updated 3 years ago
- Tensor rank learning in CP decomposition via convolutional neural network ☆11 · Apr 19, 2018 · Updated 7 years ago
- This project implements optimizers for TensorFlow and Keras, which can be used in the same way as Keras optimizers. Machine learning, Dee… ☆50 · Updated this week
- [IEEE ICASSP 2021] "A fast randomized adaptive CP decomposition for streaming tensors". In 46th IEEE International Conference on Acoustic… ☆12 · Feb 16, 2023 · Updated 3 years ago
- Task Aware Downscaling for efficient storing and accurate reconstruction in image and video domain ☆12 · Jul 25, 2024 · Updated last year
- An up-to-date list of progress made in next-generation AI. ☆11 · Apr 2, 2023 · Updated 3 years ago
- Some great skins for PotPlayer ☆11 · Mar 2, 2018 · Updated 8 years ago
- Parallel, Concurrent, and Distributed Programming in Java Specialization ☆12 · Mar 4, 2019 · Updated 7 years ago
- Data for the scico project ☆14 · Feb 3, 2026 · Updated 2 months ago
- ☆13 · Mar 11, 2023 · Updated 3 years ago
- ☆14 · Mar 28, 2022 · Updated 4 years ago
- Survey of small language models ☆18 · Jul 23, 2024 · Updated last year
- ☆14 · Sep 26, 2023 · Updated 2 years ago
- Robust natural language watermarking using invariant features ☆28 · Oct 15, 2023 · Updated 2 years ago