VITA-Group/Q-GaLore

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/VITA-Group/Q-GaLore)

VITA-Group / Q-GaLore

Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients.

☆205

Alternatives and similar repositories for Q-GaLore

Users that are interested in Q-GaLore are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

jiaweizzhao / GaLore
View on GitHub
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection
☆1,698Oct 28, 2024Updated last year
VITA-Group / WeLore
View on GitHub
[ICML 2025] From Low Rank Gradient Subspace Stabilization to Low-Rank Weights: Observations, Theories and Applications
☆52Oct 30, 2025Updated 8 months ago
drarijitdas / Natural-GaLore
View on GitHub
An extention to the GaLore paper, to perform Natural Gradient Descent in low rank subspace
☆19Oct 21, 2024Updated last year
OpenGVLab / EfficientQAT
View on GitHub
[ACL 2025 Main] EfficientQAT: Efficient Quantization-Aware Training for Large Language Models
☆342Apr 10, 2026Updated 2 months ago
whyNLP / LCKV
View on GitHub
Layer-Condensed KV cache w/ 10 times larger batch size, fewer params and less computation. Dramatic speed up with better task performance…
☆156Apr 7, 2025Updated last year
Open source password manager - Proton Pass • Ad
Securely store, share, and autofill your credentials with Proton Pass, the end-to-end encrypted password manager trusted by millions.
TianjinYellow / SPAM-Optimizer
View on GitHub
☆36Mar 12, 2025Updated last year
fishiatee / Tumera
View on GitHub
Yet another frontend for LLM, written using .NET and WinUI 3
☆11Sep 14, 2025Updated 9 months ago
zyushun / Adam-mini
View on GitHub
Code for Adam-mini: Use Fewer Learning Rates To Gain More https://arxiv.org/abs/2406.16793
☆458May 13, 2025Updated last year
microsoft / GRIN-MoE
View on GitHub
GRadient-INformed MoE
☆264Sep 25, 2024Updated last year
sanyalsunny111 / LLM-Inheritune
View on GitHub
[TMLR 2025] When Attention Collapses: How Degenerate Layers in LLMs Enable Smaller, Stronger Models
☆126Mar 6, 2026Updated 3 months ago
yidingjiang / ado
View on GitHub
The repository contains code for Adaptive Data Optimization
☆36Dec 9, 2024Updated last year
Zyphra / Zamba2
View on GitHub
PyTorch implementation of models from the Zamba2 series.
☆193Jan 23, 2025Updated last year
Cornell-RelaxML / qtip
View on GitHub
☆179Jun 22, 2025Updated last year
kongds / MoRA
View on GitHub
MoRA: High-Rank Updating for Parameter-Efﬁcient Fine-Tuning
☆362Aug 7, 2024Updated last year
Serverless GPU API endpoints on Runpod - Get Bonus Credits • Ad
Skip the infrastructure headaches. Auto-scaling, pay-as-you-go, no-ops approach lets you focus on innovating your application.
OpenMOSS / Lorsa
View on GitHub
☆30Nov 9, 2025Updated 7 months ago
microsoft / Samba
View on GitHub
[ICLR 2025] Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling
☆965Nov 16, 2025Updated 7 months ago
zqOuO / GWT
View on GitHub
☆13May 4, 2026Updated 2 months ago
mit-han-lab / omniserve
View on GitHub
[MLSys'25] QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving; [MLSys'25] LServe: Efficient Long-sequence LLM Se…
☆845Mar 6, 2025Updated last year
microsoft / VPTQ
View on GitHub
VPTQ, A Flexible and Extreme low-bit quantization algorithm
☆680Apr 25, 2025Updated last year
attashe / ModifiedBeamSampler
View on GitHub
Modified Beam Search with periodical restart
☆12Sep 12, 2024Updated last year
wdlctc / mini-s
View on GitHub
☆51Oct 29, 2024Updated last year
facebookresearch / LayerSkip
View on GitHub
Code for "LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding", ACL 2024
☆373Apr 13, 2026Updated 2 months ago
safety-research / inverse-scaling-ttc
View on GitHub
Inverse Scaling in Test-Time Compute
☆25Dec 3, 2025Updated 7 months ago
Deploy to Railway using AI coding agents - Free Credits Offer • Ad
Use Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
MadsToftrup / Apollo-dev
View on GitHub
☆17Dec 9, 2024Updated last year
evanatyourservice / llm-jax
View on GitHub
Train a SmolLM-style llm on fineweb-edu in JAX/Flax with an assortment of optimizers.
☆19Jul 24, 2025Updated 11 months ago
RobertCsordas / moeut
View on GitHub
☆93Aug 18, 2024Updated last year
princeton-nlp / SimPO
View on GitHub
[NeurIPS 2024] SimPO: Simple Preference Optimization with a Reference-Free Reward
☆954Feb 16, 2025Updated last year
TianjinYellow / StableSPAM
View on GitHub
☆28Updated this week
goombalab / Gather-and-Aggregate
View on GitHub
Experiments Notebook of "Understanding the Skill Gap in Recurrent Language Models: The Role of the Gather-and-Aggregate Mechanism"
☆16Apr 30, 2025Updated last year
NVlabs / COAT
View on GitHub
[ICLR 2025] COAT: Compressing Optimizer States and Activation for Memory-Efficient FP8 Training
☆264Aug 9, 2025Updated 10 months ago
edwardmilsom / function-space-learning-rates-paper
View on GitHub
Code for the paper "Function-Space Learning Rates"
☆23Jun 3, 2025Updated last year
VijayLingam95 / SVFT
View on GitHub
☆35Feb 10, 2025Updated last year
Deploy to Railway using AI coding agents - Free Credits Offer • Ad
Use Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
zhuhanqing / APOLLO
View on GitHub
APOLLO: SGD-like Memory, AdamW-level Performance; MLSys'25 Oustanding Paper Honorable Mention
☆365Nov 29, 2025Updated 7 months ago
microsoft / TransformerCompression
View on GitHub
For releasing code related to compression methods for transformers, accompanying our publications
☆462Jan 16, 2025Updated last year
intel / auto-round
View on GitHub
A SOTA quantization algorithm for high-accuracy low-bit LLM inference, seamlessly optimized for CPU/XPU/CUDA, with multi-datatype support…
☆1,503Updated this week
hahnyuan / PB-LLM
View on GitHub
PB-LLM: Partially Binarized Large Language Models
☆157Nov 20, 2023Updated 2 years ago
dayal-kalra / low-memory-adam
View on GitHub
☆14Mar 2, 2025Updated last year
UNITES-Lab / MC-SMoE
View on GitHub
[ICLR‘24 Spotlight] Code for the paper "Merge, Then Compress: Demystify Efficient SMoE with Hints from Its Routing Policy"
☆107Jun 20, 2025Updated last year
apple / ml-sigmoid-attention
View on GitHub
☆310Apr 23, 2025Updated last year