vectozavr/llm-hessian

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/vectozavr/llm-hessian)

vectozavr / llm-hessian

Using PyTorch autograd to compute Hessian of Perplexity for Large Language Models

☆29

Alternatives and similar repositories for llm-hessian

Users that are interested in llm-hessian are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

song-wx / SIFT
View on GitHub
[ICML2024 Spotlight] Fine-Tuning Pre-trained Large Language Models Sparsely
☆24Jun 26, 2024Updated 2 years ago
LIONS-EPFL / scion
View on GitHub
☆70Apr 8, 2026Updated 3 months ago
zyushun / hessian-spectrum
View on GitHub
Code for the paper: Why Transformers Need Adam: A Hessian Perspective
☆65Mar 11, 2025Updated last year
burlachenkok / flpytorch
View on GitHub
FL_PyTorch: Optimization Research Simulator for Federated Learning
☆35Jul 7, 2023Updated 3 years ago
fjzzq2002 / random_transformers
View on GitHub
Official code for "Algorithmic Capabilities of Random Transformers" (NeurIPS 2024)
☆15Sep 28, 2024Updated last year
1-Click AI Models by DigitalOcean Gradient • Ad
Deploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click. Zero configuration with optimized deployments.
JHU-CLSP / RATIONALYST
View on GitHub
Code for RATIONALYST: Pre-training Process-Supervision for Improving Reasoning https://arxiv.org/pdf/2410.01044
☆36Oct 3, 2024Updated last year
liziniu / HyperDQN
View on GitHub
Code for ICLR 2022 Paper (HyperDQN: A Randomized Exploration Method for Deep Reinforcement Learning)
☆12Nov 28, 2023Updated 2 years ago
xie-lab-ml / Mano-Restriking-Manifold-Optimization-for-LLM-Training
View on GitHub
The official code of "Mano: Restriking Manifold Optimization for LLM Training".
☆25Jun 1, 2026Updated last month
xie-lab-ml / Meissonic-Inference
View on GitHub
Bag of Design Choices for Inference of High-Resolution Masked Generative Transformer
☆16Nov 21, 2024Updated last year
Yanjun-Zhao / HiZOO
View on GitHub
Second-Order Fine-Tuning without Pain for LLMs: a Hessian Informed Zeroth-Order Optimizer
☆26Feb 11, 2025Updated last year
eric-mitchell / concord
View on GitHub
☆14Nov 15, 2022Updated 3 years ago
causalNLP / amr_llm
View on GitHub
This repo explores how AMR to address tasks difficult for LLMs
☆13Jan 15, 2024Updated 2 years ago
MachineLearningSystem / 25ASPLOS-Medusa
View on GitHub
Medusa: Accelerating Serverless LLM Inference with Materialization [ASPLOS'25]
☆12Nov 8, 2024Updated last year
qizhangli / Gradient-based-Jailbreak-Attacks
View on GitHub
Code for our NeurIPS 2024 paper Improved Generation of Adversarial Examples Against Safety-aligned LLMs
☆12Nov 7, 2024Updated last year
Virtual machines for every use case on DigitalOcean • Ad
Get dependable uptime with 99.99% SLA, simple security tools, and predictable monthly pricing with DigitalOcean's virtual machines, called Droplets.
wenquanlu / huginn-latent-cot
View on GitHub
[COLM 2025: 1st Workshop on the Application of LLM Explainability to Reasoning and Planning] Latent Chain-of-Thought? Decoding the Depth-…
☆20Oct 4, 2025Updated 9 months ago
rmin2000 / adv_tracing
View on GitHub
Identification of the Adversary from a Single Adversarial Example (ICML 2023)
☆10Jul 15, 2024Updated 2 years ago
technion-cs-nlp / irm-for-nli
View on GitHub
☆11Jun 2, 2022Updated 4 years ago
konstmish / opt_methods
View on GitHub
Benchmarking optimization methods on convex problems.
☆36Aug 8, 2025Updated 11 months ago
NielsRogge / tapas_utils
View on GitHub
A package containing utils for the PyTorch version of the Tapas algorithm.
☆11Apr 29, 2021Updated 5 years ago
ckbjimmy / clneg
View on GitHub
Clinical Text Summarization with Syntax-Based Negation and Semantic Concept Identification
☆21Mar 3, 2020Updated 6 years ago
erdavids / Hex_Map_Tutorial
View on GitHub
☆10Apr 28, 2020Updated 6 years ago
richardodliu / OpenCodeEval
View on GitHub
☆52Mar 9, 2026Updated 4 months ago
sanketx / AL-foundation-models
View on GitHub
Active Learning in the era of Foundation Models
☆14Apr 16, 2025Updated last year
Deploy on Railway without the complexity - Free Credits Offer • Ad
Connect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
Codewarnings / Six-in-a-row
View on GitHub
该六子棋程序使用Java语言编写,内置AI落子,主要由阿尔法贝塔搜索+评估函数实现,存在一定的bug,智能方面还行吧
☆13Jul 24, 2021Updated 5 years ago
leloykun / adaptive-muon
View on GitHub
A single-line modification to any (dualizer-based) optimizer that allows the optimizer to adapt to the scale of the gradients as they cha…
☆19Jan 11, 2025Updated last year
princeton-pli / what-makes-good-rm
View on GitHub
[NeurIPS 2025] What Makes a Reward Model a Good Teacher? An Optimization Perspective
☆44Sep 18, 2025Updated 10 months ago
r2llab / GTTA
View on GitHub
This codebase is to reproduce the results of the paper "Grounded Test-Time Adaptation for LLM Agents".
☆17Mar 4, 2026Updated 4 months ago
Laborieux-Axel / Equilibrium-Propagation
View on GitHub
EP
☆20Mar 9, 2021Updated 5 years ago
Unakar / Spectral-Sphere-Optimizer
View on GitHub
Spectral Sphere Optimizer
☆131Mar 23, 2026Updated 4 months ago
facebookresearch / scalable-curvature
View on GitHub
Code for Dayal Kalra's research internship on scalable curvature measures for neural networks.
☆29Feb 3, 2026Updated 5 months ago
HanjieChen / REV
View on GitHub
Code for the paper "REV: Information-Theoretic Evaluation of Free-Text Rationales"
☆16Aug 11, 2023Updated 2 years ago
SNU-ARC / any-precision-llm
View on GitHub
[ICML 2024 Oral] Any-Precision LLM: Low-Cost Deployment of Multiple, Different-Sized LLMs
☆130Jul 4, 2025Updated last year
1-Click AI Models by DigitalOcean Gradient • Ad
Deploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click. Zero configuration with optimized deployments.
ruz048 / AutoLoRA
View on GitHub
☆10Apr 16, 2024Updated 2 years ago
Outsider565 / LoRA-GA
View on GitHub
☆219Nov 25, 2025Updated 8 months ago
renll / SeqBoat
View on GitHub
[NeurIPS 2023] Sparse Modular Activation for Efficient Sequence Modeling
☆40Dec 2, 2023Updated 2 years ago
AweAI-Team / ScaleSWE
View on GitHub
☆87Jul 21, 2026Updated last week
AntiBargu / RUC-YOJ
View on GitHub
中国人民大学 YOJ 题库
☆12Jun 9, 2022Updated 4 years ago
amazon-science / comm-prompt
View on GitHub
CoMM: Collaborative Multi-Agent, Multi-Reasoning-Path Prompting for Complex Problem Solving (NAACL 2024 Findings))
☆16Apr 26, 2024Updated 2 years ago
liziniu / KnapsackRL
View on GitHub
☆19Oct 30, 2025Updated 8 months ago