Confidence Regulation Neurons in Language Models (NeurIPS 2024)
☆15Feb 1, 2025Updated last year
Alternatives and similar repositories for confidence-regulation-neurons
Users that are interested in confidence-regulation-neurons are comparing it to the libraries listed below.
- Align your LM to express calibrated verbal statements of confidence in its long-form generations.☆29Jun 4, 2024Updated last year
- [ICLR 2025] Understanding and Enhancing Safety Mechanisms of LLMs via Safety-Specific Neuron☆30Apr 30, 2025Updated 11 months ago
- [ICLR 2025] Official implementation of paper "Dynamic Low-Rank Sparse Adaptation for Large Language Models".☆24Mar 16, 2025Updated last year
- This code accompanies the paper "Bayesian Framework for Information-Theoretic Probing" published in EMNLP 2021.☆10Aug 23, 2021Updated 4 years ago
- Official Implementation of SSFF, Startup Success Forecasting Framework☆17Aug 31, 2025Updated 6 months ago
- Code Repository for "A Causal Framework to Quantify the Robustness of Mathematical Reasoning with Language Models".☆15Oct 14, 2022Updated 3 years ago
- Code for TACL 2020 paper "An Empirical Study on Robustness to Spurious Correlations using Pre-trained Language Models"☆14Jul 31, 2020Updated 5 years ago
- ☆10May 17, 2024Updated last year
- A single script to facilitate submitting papers to ArXiv.org☆18Apr 14, 2018Updated 7 years ago
- Teaching machines to spell with deep learning (acc=>80%) e.g. a model hears "pɹˈaʊd˺ɚ" and writes "prowder" (but it should be "prouder")☆19Jun 1, 2017Updated 8 years ago
- SimKO: Simple Pass@K Policy Optimization☆28Oct 24, 2025Updated 5 months ago
- [ICML'25] Our study systematically investigates massive values in LLMs' attention mechanisms. First, we observe massive values are concen…☆86Jun 20, 2025Updated 9 months ago
- Course materials for the MVA course "algorithms for speech and language processing"☆12Mar 29, 2023Updated 3 years ago
- ☆11Mar 24, 2023Updated 3 years ago
- ☆73Mar 6, 2025Updated last year
- Evaluate uncertainty, calibration, accuracy, and fairness of LLMs on real-world survey data!☆25Dec 14, 2025Updated 3 months ago
- Code for Principal Masked Autoencoders☆31Feb 4, 2026Updated last month
- Word acquisition in neural language models (TACL 2022).☆20Jan 30, 2025Updated last year
- ☆12Jan 25, 2025Updated last year
- A tool for classifying errors in coreference resolution☆28Jun 27, 2023Updated 2 years ago
- ☆42Jun 11, 2025Updated 9 months ago
- [NeurIPS 2024] A Novel Rank-Based Metric for Evaluating Large Language Models☆57May 28, 2025Updated 10 months ago
- This is the repository for the Interspeech 2018 paper "Coherence models for dialogue".☆19Jan 9, 2020Updated 6 years ago
- Code for the ICCV 2025 paper "IDEATOR: Jailbreaking and Benchmarking Large Vision-Language Models Using Themselves"☆17Jul 11, 2025Updated 8 months ago
- [NeurIPS 2025] The official implementation of the paper "DRIFT: Dynamic Rule-Based Defense with Injection Isolation for Securing LLM Agen…☆42Mar 19, 2026Updated last week
- Code and full version of the paper "Hijacking Attacks against Neural Network by Analyzing Training Data"☆14Feb 28, 2024Updated 2 years ago
- ☆13Oct 21, 2021Updated 4 years ago
- ☆14Jun 6, 2023Updated 2 years ago
- ☆11Apr 3, 2024Updated last year
- Data and code for the paper: Finding Safety Neurons in Large Language Models☆25Jan 29, 2026Updated 2 months ago
- Python API for loading language data from American-English CHILDES database☆18Aug 14, 2022Updated 3 years ago
- All my experiments with the various transformers and various transformer frameworks available☆14Apr 30, 2021Updated 4 years ago
- An evaluation framework for mitigating DNN backdoor attacks using data augmentations☆11Dec 10, 2020Updated 5 years ago
- Official implementation of Visco-Attack (EMNLP 2025 Main). We will progressively release the code and one-click reproduction scripts.☆30Aug 22, 2025Updated 7 months ago
- Exploring the Limitations of Large Language Models on Multi-Hop Queries☆33Mar 2, 2025Updated last year
- Reinforcement learning course, mainly about how to solve problems with reinforcement learning☆15Dec 10, 2024Updated last year
- ☆14Jul 7, 2021Updated 4 years ago
- [EuroSys'25] Mist: Efficient Distributed Training of Large Language Models via Memory-Parallelism Co-Optimization☆22Feb 5, 2026Updated last month
- Audio Jailbreak: An Open Comprehensive Benchmark for Jailbreaking Large Audio-Language Models☆32Oct 6, 2025Updated 5 months ago