bpwu1/confidence-regulation-neurons

Confidence Regulation Neurons in Language Models (NeurIPS 2024)

☆15

Alternatives and similar repositories for confidence-regulation-neurons

Users who are interested in confidence-regulation-neurons are comparing it to the libraries listed below.

tatsu-lab / linguistic_calibration
Align your LM to express calibrated verbal statements of confidence in its long-form generations.
☆30 · Jun 4, 2024 · Updated 2 years ago

zhaoyiran924 / Safety-Neuron
[ICLR 2025] Understanding and Enhancing Safety Mechanisms of LLMs via Safety-Specific Neuron
☆33 · Apr 30, 2025 · Updated last year

rycolab / bayesian-mi
This code accompanies the paper "Bayesian Framework for Information-Theoretic Probing" published in EMNLP 2021.
☆10 · Aug 23, 2021 · Updated 4 years ago

aladinD / SafeMERGE
Code for SafeMERGE (ICLR 2025).
☆15 · Apr 1, 2025 · Updated last year

alestolfo / causal-math
Code Repository for "A Causal Framework to Quantify the Robustness of Mathematical Reasoning with Language Models".
☆15 · Oct 14, 2022 · Updated 3 years ago

lifu-tu / Study-NLP-Robustness
Code for TACL 2020 paper "An Empirical Study on Robustness to Spurious Correlations using Pre-trained Language Models"
☆14 · Jul 31, 2020 · Updated 5 years ago

xisen-w / Startup-Success-Forecasting-Framework
Official Implementation of SSFF, Startup Success Forecasting Framework
☆16 · Aug 31, 2025 · Updated 10 months ago

wzhuang-xmu / LoSA
[ICLR 2025] Official implementation of paper "Dynamic Low-Rank Sparse Adaptation for Large Language Models".
☆25 · Mar 16, 2025 · Updated last year

ArjunPanickssery / self_recognition
☆10 · May 17, 2024 · Updated 2 years ago

kdu4108 / context-vs-prior-finetuning
☆15 · May 27, 2025 · Updated last year

idramalab / arxiv_script
A single script to facilitate submitting papers to ArXiv.org
☆18 · Apr 14, 2018 · Updated 8 years ago

javiferran / sae_entities
☆78 · Mar 6, 2025 · Updated last year

gouki510 / Topology_of_Reasoning
☆42 · Jun 11, 2025 · Updated last year

wassname / phoneme2grapheme
Teaching machines to spell with deep learning (acc=>80%) e.g. a model hears "pɹˈaʊd˺ɚ" and writes "prowder" (but it should be "prouder")
☆19 · Jun 1, 2017 · Updated 9 years ago

edupoux / MVA_2023_SL
Course materials for the MVA course "algorithms for speech and language processing"
☆13 · Mar 29, 2023 · Updated 3 years ago

alicebizeul / pmae
Code for Principal Masked Autoencoders
☆31 · Feb 4, 2026 · Updated 5 months ago

zepingyu0512 / awesome-LLM-neuron
☆36 · Jun 13, 2025 · Updated last year

jkkummerfeld / berkeley-coreference-analyser
A tool for classifying errors in coreference resolution
☆29 · Jun 27, 2023 · Updated 3 years ago

tylerachang / word-acquisition-language-models
Word acquisition in neural language models (TACL 2022).
☆21 · Jan 30, 2025 · Updated last year

RU-System-Software-and-Security / NIC
☆12 · Mar 24, 2023 · Updated 3 years ago

socialfoundations / folktexts
Evaluate uncertainty, calibration, accuracy, and fairness of LLMs on real-world survey data!
☆29 · Jul 7, 2026 · Updated 3 weeks ago

alecervi / Coherence-models-for-dialogue
This is the repository for the Interspeech 2018 paper "Coherence models for dialogue".
☆19 · Jan 9, 2020 · Updated 6 years ago

xi-j / Style-Talker
An official implementation of Style-Talker for Spoken Dialogue Generation
☆23 · Jan 12, 2025 · Updated last year

waltonfuture / Diff-eRank
[NeurIPS 2024] A Novel Rank-Based Metric for Evaluating Large Language Models
☆59 · May 28, 2025 · Updated last year

UIUCLearningLanguageLab / AOCHILDES
Python API for loading language data from American-English CHILDES database
☆18 · Aug 14, 2022 · Updated 3 years ago

RUIYUN-ML / ERM-KTP
☆11 · Apr 3, 2024 · Updated 2 years ago

lordtt13 / transformers-experiments
All my experiments with the various transformers and various transformer frameworks available
☆14 · Apr 30, 2021 · Updated 5 years ago

swei2001 / RouteSAEs
☆15 · Jan 2, 2026 · Updated 6 months ago

inspire-group / tta_risk
☆15 · Jun 6, 2023 · Updated 3 years ago

TDteach / Demon-in-the-Variant
☆13 · Oct 21, 2021 · Updated 4 years ago

Kim-Minseon / APGP
Automatic Jailbreaking of the Text-to-Image Generative AI Systems
☆15 · Jun 23, 2024 · Updated 2 years ago

xiangruihu / bilibili
☆15 · Aug 3, 2025 · Updated 11 months ago

MoroccoAI / Moroccans-top-AI-confs
Curated list of Moroccans publishing in the most prestigious AI conferences
☆11 · Jul 6, 2026 · Updated 3 weeks ago

MarvinLvn / BabySLM
Behavioral probing of language acquisition models at the lexical and syntactic level
☆20 · Jul 17, 2023 · Updated 3 years ago

CAMeL-Lab / Gumar-Ngrams
The complete [1 to 5]-gram Gumar Corpus in the style of Google n-grams.
☆12 · Feb 5, 2020 · Updated 6 years ago

YiZeng623 / DeepSweep
An evaluation framework for mitigating DNN backdoor attacks using data augmentations
☆11 · Dec 10, 2020 · Updated 5 years ago

yuki-younai / Jailbreak-R1
Official implementation of Jailbreak-R1
☆15 · Jul 16, 2025 · Updated last year

j-luo93 / DecipherUnsegmented
☆15 · Jul 7, 2021 · Updated 5 years ago

NISPLab / CleanSheet
Code and full version of the paper "Hijacking Attacks against Neural Network by Analyzing Training Data"
☆14 · Feb 28, 2024 · Updated 2 years ago