GuoTianYu2000/Active-Dormant-Attention

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/GuoTianYu2000/Active-Dormant-Attention)

GuoTianYu2000 / Active-Dormant-Attention

codes and plots for "Active-Dormant Attention Heads: Mechanistically Demystifying Extreme-Token Phenomena in LLMs"

☆11

Alternatives and similar repositories for Active-Dormant-Attention

Users that are interested in Active-Dormant-Attention are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

chenw20 / SGPA
View on GitHub
Example code of Sparse Gaussian Process Attention (ICLR 2023)
☆26Sep 15, 2025Updated 10 months ago
sjelassi / ebft_openrlhf
View on GitHub
Code for "Matching Features, Not Tokens: Energy-Based Fine-Tuning of Language Models".
☆23Mar 16, 2026Updated 4 months ago
tim-lawson / mlsae
View on GitHub
Multi-Layer Sparse Autoencoders (ICLR 2025)
☆30Feb 6, 2026Updated 5 months ago
fangyuan-ksgk / selective-attention-transformer
View on GitHub
Unofficial Implementation of Selective Attention Transformer
☆20Oct 31, 2024Updated last year
zijwang / talkdown
View on GitHub
Dataset and pre-trained model of EMNLP-IJCNLP 2019 paper "TalkDown: A Corpus for Condescension Detection in Context."
☆10Jan 26, 2020Updated 6 years ago
Open source password manager - Proton Pass • Ad
Securely store, share, and autofill your credentials with Proton Pass, the end-to-end encrypted password manager trusted by millions.
franrruiz / augment-reduce
View on GitHub
Code for Augment & Reduce, a scalable stochastic algorithm for large categorical distributions
☆10May 16, 2018Updated 8 years ago
mayank31398 / ladder-residual-inference
View on GitHub
☆14Jul 13, 2025Updated last year
thomasahle / cce
View on GitHub
Clustered Compositional Embeddings
☆13Oct 25, 2023Updated 2 years ago
inria-thoth / csa
View on GitHub
Official Pytorch implementation of Chromatic Graph Transformers
☆10Jun 14, 2023Updated 3 years ago
brando90 / ultimate-anatome
View on GitHub
Ἀνατομή is a PyTorch library to analyze representation of neural networks
☆13Jan 31, 2024Updated 2 years ago
MachineLearningSystem / 25ASPLOS-Medusa
View on GitHub
Medusa: Accelerating Serverless LLM Inference with Materialization [ASPLOS'25]
☆12Nov 8, 2024Updated last year
UCSC-REAL / FLAT
View on GitHub
[ICLR 2025] FLAT: LLM Unlearning via Loss Adjustment with Only Forget Data
☆14Feb 26, 2025Updated last year
christopher-beckham / annotated-conditional-diffusion
View on GitHub
☆10Aug 26, 2022Updated 3 years ago
lindermanlab / hackathons
View on GitHub
Jupyter notebooks from our weekly (or so) hackathons
☆11Dec 3, 2024Updated last year
Managed hosting for WordPress and PHP on Cloudways • Ad
Managed hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
lzzcd001 / ade-code
View on GitHub
Code for "Exponential Family Estimation via Adversarial Dynamics Embedding" (NeurIPS 2019)
☆14Nov 26, 2019Updated 6 years ago
wtong98 / mlp-icl
View on GitHub
☆12Sep 16, 2024Updated last year
jopetty / word-problem
View on GitHub
Experiments on the impact of depth in transformers and SSMs.
☆44Oct 23, 2025Updated 9 months ago
dejwid / stack-overcloned
View on GitHub
☆12Jul 6, 2021Updated 5 years ago
damek / gd-lean
View on GitHub
☆37Feb 8, 2026Updated 5 months ago
pietrolesci / memorisation-profiles
View on GitHub
This is the official implementation for our ACL 2024 paper: "Causal Estimation of Memorisation Profiles".
☆25Mar 25, 2025Updated last year
fKunstner / noise-sgd-adam-sign
View on GitHub
☆16Apr 26, 2023Updated 3 years ago
xinmei9322 / semicrowd
View on GitHub
Code for Semi-crowdsourced Clustering with Deep Generative Models
☆12Dec 9, 2022Updated 3 years ago
aboustati / vargrad
View on GitHub
Code accompanying VarGrad: A Low-Variance Gradient Estimator for Variational Inference
☆12Oct 12, 2020Updated 5 years ago
Managed Database hosting by DigitalOcean • Ad
PostgreSQL, MySQL, MongoDB, Kafka, Valkey, and OpenSearch available. Automatically scale up storage and focus on building your apps.
nhatpd / iADMM
View on GitHub
iADMM for a low-rank representation optimization problem
☆13Feb 5, 2021Updated 5 years ago
petergroth / kermut
View on GitHub
This is the official code repository for the paper NeurIPS 2024 spotlight paper "Kermut: Composite kernel regression for protein variant …
☆45Aug 5, 2025Updated 11 months ago
revelio-diffusion / revelio
View on GitHub
☆26Jun 29, 2025Updated last year
ratschlab / bnn_priors
View on GitHub
Code for the paper "Bayesian Neural Network Priors Revisited"
☆61Jul 1, 2021Updated 5 years ago
BaohaoLiao / frac-cot
View on GitHub
[COLM 2026] An efficient 3D sampling method for long-CoT LLM.
☆16May 25, 2025Updated last year
Bond1995 / Markov
View on GitHub
Code for experiments on transformers using Markovian data.
☆22Nov 22, 2024Updated last year
umich-sota / TF-as-SVM
View on GitHub
☆12Jan 17, 2024Updated 2 years ago
RLHFlow / RAFT
View on GitHub
This is an official implementation of the Reward rAnked Fine-Tuning Algorithm (RAFT), also known as iterative best-of-n fine-tuning or re…
☆43Sep 22, 2024Updated last year
YunyiShen / wild-posteriors-in-the-wild
View on GitHub
Posterior with interesting shapes from actually used models
☆13Feb 10, 2025Updated last year
Deploy on Railway without the complexity - Free Credits Offer • Ad
Connect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
christopher-beckham / deep-unimodal-ordinal
View on GitHub
☆12Oct 21, 2017Updated 8 years ago
nshepperd / gumbel-rao-pytorch
View on GitHub
☆11Jul 25, 2021Updated 5 years ago
wesg52 / llm-context-neurons
View on GitHub
Find context neurons in Pythia models.
☆13Jun 13, 2023Updated 3 years ago
KempnerInstitute / llm_uncertainty
View on GitHub
Code for the paper "Distinguishing the Knowable from the Unknowable with Language Models"
☆11Jul 18, 2026Updated last week
ZQZCalin / trainit
View on GitHub
☆13Mar 10, 2026Updated 4 months ago
mururu / rafute
View on GitHub
An implementation of Raft Consensus Algorithm in Elixir
☆21Apr 3, 2016Updated 10 years ago
yafuly / PromptNMT
View on GitHub
☆15Dec 2, 2022Updated 3 years ago