Using PyTorch autograd to compute Hessian of Perplexity for Large Language Models
☆29Apr 17, 2025Updated last year
Alternatives and similar repositories for llm-hessian
Users that are interested in llm-hessian are comparing it to the libraries listed below.
- Official code for "Algorithmic Capabilities of Random Transformers" (NeurIPS 2024)☆16Sep 28, 2024Updated last year
- Code for ICLR 2022 Paper (HyperDQN: A Randomized Exploration Method for Deep Reinforcement Learning)☆12Nov 28, 2023Updated 2 years ago
- Minimalistic port of NanoGUI claim works with SDL API w/o external dependencies.☆12Sep 4, 2019Updated 6 years ago
- Irene is a python package that aims to be a toolkit for global optimization problems that can be realized algebraically. It generalizes L…☆15Updated this week
- Bag of Design Choices for Inference of High-Resolution Masked Generative Transformer☆16Nov 21, 2024Updated last year
- An approach for Circuit Synthesis using Dataset Threshold queries.☆14May 28, 2023Updated 2 years ago
- ☆16Oct 14, 2020Updated 5 years ago
- Links to publications that focus on the interpretation and analysis of in-context learning☆15Oct 17, 2024Updated last year
- This repo explores how AMR can address tasks that are difficult for LLMs☆13Jan 15, 2024Updated 2 years ago
- Medusa: Accelerating Serverless LLM Inference with Materialization [ASPLOS'25]☆12Nov 8, 2024Updated last year
- [NeurIPS 2025] Beyond Masked and Unmasked: Discrete Diffusion Models via Partial Masking☆30Mar 18, 2026Updated last month
- ☆11Jun 2, 2022Updated 3 years ago
- Official code for Guiding Language Model Math Reasoning with Planning Tokens☆19Feb 29, 2024Updated 2 years ago
- Official Implementation of wd1☆28Sep 25, 2025Updated 6 months ago
- ☆62Apr 8, 2026Updated last week
- [TPAMI] The official implementation of our paper "Improved and Accelerated Text-to-Image Generation with Collect, Reflect, and Refine".☆31Mar 8, 2026Updated last month
- PeRL: Parameter-Efficient Reinforcement Learning☆74Apr 6, 2026Updated last week
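- This Connect6 program is written in Java and has a built-in AI player, implemented mainly with alpha-beta search plus an evaluation function; it has some bugs, and the AI is passable☆12Jul 24, 2021Updated 4 years ago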
- [ICLR 2025] TidalDecode: A Fast and Accurate LLM Decoding with Position Persistent Sparse Attention☆53Aug 6, 2025Updated 8 months ago
- Minimal Academic Website Template☆14Feb 20, 2025Updated last year
- Fork of Flame repo for training of some new stuff in development☆19Updated this week
- Codebase for the Progressive Mixed-Precision Decoding paper.☆19Jul 15, 2025Updated 9 months ago
- [ICML2024 Spotlight] Fine-Tuning Pre-trained Large Language Models Sparsely☆24Jun 26, 2024Updated last year
- Perform bayesian distribution regression☆13Mar 19, 2018Updated 8 years ago
- [ICML 2024 Oral] Any-Precision LLM: Low-Cost Deployment of Multiple, Different-Sized LLMs☆123Jul 4, 2025Updated 9 months ago
- Models and code for the ICLR 2020 workshop paper "Towards Understanding Normalization in Neural ODEs"☆16Apr 27, 2020Updated 5 years ago
- A benchmark of real-world DL kernel problems☆177Updated this week
- ☆25Apr 3, 2025Updated last year
- ShotStream: Streaming Multi-Shot Video Generation for Interactive Storytelling☆119Mar 31, 2026Updated 2 weeks ago
- This repo contains the code for the paper "Understanding and Mitigating Hallucinations in Large Vision-Language Models via Modular Attrib…☆36Jul 14, 2025Updated 9 months ago
- ☆30Jul 22, 2024Updated last year
- Arbitrary Distribution Modeling with Censorship in Real Time Bidding Advertising for KDD'22☆16Mar 9, 2022Updated 4 years ago
- nanoGPT-like codebase for LLM training☆117Nov 7, 2025Updated 5 months ago
- Sirius, an efficient correction mechanism, which significantly boosts Contextual Sparsity models on reasoning tasks while maintaining its…☆21Sep 10, 2024Updated last year
- Official Implementation for Inference-time Scaling of Diffusion Models through Classical Search☆31Oct 8, 2025Updated 6 months ago
- The open-source materials for paper "Sparsing Law: Towards Large Language Models with Greater Activation Sparsity".☆30Nov 12, 2024Updated last year
- ☆34Dec 15, 2024Updated last year
- Flash-Muon: An Efficient Implementation of Muon Optimizer☆248Jun 15, 2025Updated 10 months ago
- ☆16Mar 19, 2026Updated last month