drarijitdas/Natural-GaLore

An extension of the GaLore paper that performs natural gradient descent in a low-rank subspace

☆19

Alternatives and similar repositories for Natural-GaLore

Users that are interested in Natural-GaLore are comparing it to the libraries listed below.

zqOuO / GWT
☆13 · May 4, 2026 · Updated 2 months ago
p1nksnow / MoICE
Official implementation for "Mixture of In-Context Experts Enhance LLMs’ Awareness of Long Contexts" (accepted at NeurIPS 2024)
☆14 · Jan 7, 2025 · Updated last year
IGITUGraz / SparseAdversarialTraining
Code for "Training Adversarially Robust Sparse Networks via Bayesian Connectivity Sampling" [ICML 2021]
☆10 · Mar 14, 2022 · Updated 4 years ago
VITA-Group / WeLore
[ICML 2025] From Low Rank Gradient Subspace Stabilization to Low-Rank Weights: Observations, Theories and Applications
☆52 · Oct 30, 2025 · Updated 8 months ago
RUCKBReasoning / CodeRM
Official code implementation for the ACL 2025 paper: 'Dynamic Scaling of Unit Tests for Code Reward Modeling'
☆27 · May 16, 2025 · Updated last year
safety-research / inverse-scaling-ttc
Inverse Scaling in Test-Time Compute
☆26 · Dec 3, 2025 · Updated 7 months ago
kyleliang919 / Online-Subspace-Descent
[NeurIPS 2024] Low-rank, memory-efficient optimizer without SVD
☆33 · Jul 1, 2025 · Updated last year
keeeeenw / TinyLlama
The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.
☆14 · Mar 30, 2024 · Updated 2 years ago
linkedin / ControlLLM
Control LLM
☆23 · Apr 6, 2025 · Updated last year
uynaes / RankingAwareCLIP
[ICLR'25] Official repository of paper: Ranking-aware adapter for text-driven image ordering with CLIP
☆16 · Apr 17, 2025 · Updated last year
mtharrison / promptscaper
A client-only OpenAI LLM Playground for prototyping agents without writing any code.
☆22 · Aug 31, 2023 · Updated 2 years ago
gallen881 / Physics_Master
Physics Master is a model fine-tuned from llama3-8B-Instruct. It can answer your physics questions!
☆16 · Aug 24, 2024 · Updated last year
Qichuzyy / POA
Official implementation of ECCV24 paper: POA
☆24 · Aug 8, 2024 · Updated last year
op-rs / durin
A Rust library for creating solvers in the OP Stack's dispute protocol
☆19 · Jan 15, 2024 · Updated 2 years ago
btfranklin / promptdown
A Python package that enables the creation and parsing of structured prompts for language models in markdown format
☆17 · Updated this week
VITA-Group / SEAL
[COLM 2025] SEAL: Steerable Reasoning Calibration of Large Language Models for Free
☆60 · Apr 6, 2025 · Updated last year
OpenEvaByte / evabyte
EvaByte: Efficient Byte-level Language Models at Scale
☆119 · Apr 22, 2025 · Updated last year
HCY123902 / atg-w-fg-rw
☆10 · May 27, 2024 · Updated 2 years ago
UNITES-Lab / MC-SMoE
[ICLR'24 Spotlight] Code for the paper "Merge, Then Compress: Demystify Efficient SMoE with Hints from Its Routing Policy"
☆108 · Jun 20, 2025 · Updated last year
GeorgeVern / lmcor
Code for the EACL 2024 paper: "Small Language Models Improve Giants by Rewriting Their Outputs"
☆12 · Apr 20, 2024 · Updated 2 years ago
slashml / awesome-finetuning
☆31 · Aug 27, 2024 · Updated last year
rungalileo / examples
Examples of using Galileo for better ML data quality!!
☆13 · Feb 5, 2026 · Updated 5 months ago
UNITES-Lab / C2R-MoE
[NAACL'25 🏆 SAC Award] Official code for "Advancing MoE Efficiency: A Collaboration-Constrained Routing (C2R) Strategy for Better Expert…
☆16 · Feb 4, 2025 · Updated last year
mzf666 / LORO-main
Official implementation of ICLR 2025 'LORO: Parameter and Memory Efficient Pretraining via Low-rank Riemannian Optimization'
☆17 · Apr 24, 2025 · Updated last year
archiki / UTGenDebug
Code for our paper "Learning to Generate Unit Tests for Automated Debugging"
☆18 · Mar 7, 2025 · Updated last year
UNITES-Lab / Occult
[ICML'25] Official code for paper "Occult: Optimizing Collaborative Communication across Experts for Accelerated Parallel MoE Training an…
☆13 · Apr 17, 2025 · Updated last year
PositionalHidden / PositionalHidden
To mitigate position bias in LLMs, especially in long-context scenarios, we scale only one dimension of LLMs, reducing position bias and …
☆12 · Jun 18, 2024 · Updated 2 years ago
machilusZ / FastGen
This repo contains the source code for: Model Tells You What to Discard: Adaptive KV Cache Compression for LLMs
☆44 · Aug 14, 2024 · Updated last year
linzhu123455 / spotify-skip-prediction-top-1-solution
☆15 · Jan 11, 2019 · Updated 7 years ago
UnstableLlama / ezexl3
easy exllama interface w/ automation & evals
☆17 · Jul 13, 2026 · Updated last week
Shen-Lab / Bayesian-L2O
[ICLR 2022] "Bayesian Modeling and Uncertainty Quantification for Learning to Optimize: What, Why, and How" by Yuning You, Yue Cao, Tianl…
☆14 · Aug 19, 2022 · Updated 3 years ago
iDoka / hdl-secded-producer
MATLAB/Octave generator of Hamming ECC coding. Output format is Verilog HDL.
☆12 · Dec 27, 2022 · Updated 3 years ago
GDPlumb / ExpO
Explanation Optimization
☆13 · Oct 16, 2020 · Updated 5 years ago
lucidrains / multiscreen
Implementation of Multiscreen proposed by Ken Nakanishi for "Screening is Enough"
☆17 · May 13, 2026 · Updated 2 months ago
akhilkedia / TranformersGetStable
[ICML 2024] Official Repository for the paper "Transformers Get Stable: An End-to-End Signal Propagation Theory for Language Models"
☆11 · Jul 19, 2024 · Updated 2 years ago
declare-lab / safety-arithmetic
☆13 · Jan 14, 2025 · Updated last year
sgl-project / whl
SGLang Kernel Wheel Index
☆24 · Updated this week
hussi9 / skill-router
Skill + Agent + Model + Thinking depth — auto-routed before any tool fires. One SKILL.md for Claude Code. 90% routing accuracy, per-step …
☆15 · Jul 11, 2026 · Updated last week
apple / ml-hypercloning
☆54 · Nov 3, 2024 · Updated last year