Code for the paper "Distinguishing the Knowable from the Unknowable with Language Models"
☆11Apr 15, 2024Updated last year
Alternatives and similar repositories for llm_uncertainty
Users that are interested in llm_uncertainty are comparing it to the libraries listed below.
- ☆35Feb 20, 2025Updated last year
- Code for reproducing our paper "Low Rank Adapting Models for Sparse Autoencoder Features"☆17Mar 31, 2025Updated 11 months ago
- A tiny easily hackable implementation of a feature dashboard.☆16Oct 21, 2025Updated 5 months ago
- DESSERT Efficiently Searches Sets of Embeddings via Retrieval Tables☆17Feb 21, 2024Updated 2 years ago
- Provable Worst Case Guarantees for the Detection of Out-of-Distribution Data☆13Sep 20, 2022Updated 3 years ago
- A new simple method for dataset distillation called Randomized Truncated Backpropagation Through Time (RaT-BPTT)☆13Apr 21, 2024Updated last year
- A fast high dimensional near neighbor search algorithm based on group testing and locality sensitive hashing☆23Dec 9, 2023Updated 2 years ago
- Code for our paper "Decomposing The Dark Matter of Sparse Autoencoders"☆23Feb 6, 2025Updated last year
- Connecting Interpretability and Robustness in Decision Trees through Separation☆17May 8, 2021Updated 4 years ago
- Do input gradients highlight discriminative features? [NeurIPS 2021] (https://arxiv.org/abs/2102.12781)☆12Jan 10, 2023Updated 3 years ago
- Simulates a debate between two AI agents on a given topic☆14Oct 11, 2024Updated last year
- Code for reproducing our paper "Are Sparse Autoencoders Useful? A Case Study in Sparse Probing"☆32Mar 31, 2025Updated 11 months ago
- ☆14Jul 5, 2023Updated 2 years ago
- Bias Benchmark for Natural Language Inference. Code repo for the Findings of NAACL 2022 paper "On Measuring Social Biases in Prompt-Based…☆15Apr 28, 2022Updated 3 years ago
- ☆15Mar 29, 2025Updated 11 months ago
- Sharpness-Aware Minimization Leads to Low-Rank Features [NeurIPS 2023]☆29Sep 22, 2023Updated 2 years ago
- MDL Complexity computations and experiments from the paper "Revisiting complexity and the bias-variance tradeoff".☆18Jun 12, 2023Updated 2 years ago
- Group-conditional DRO to alleviate spurious correlations☆15Jul 15, 2021Updated 4 years ago
- The Good, The Bad, and The Greedy: Evaluation of LLMs Should Not Ignore Non-Determinism☆30Jul 17, 2024Updated last year
- Code and plots for "Active-Dormant Attention Heads: Mechanistically Demystifying Extreme-Token Phenomena in LLMs"☆10Dec 30, 2024Updated last year
- The code to reproduce CVPR 2021 paper "Towards Robust Classification Model by Counterfactual and Invariant Data Generation"☆17Jul 29, 2021Updated 4 years ago
- The corresponding code from our paper "Making Reasoning Matter: Measuring and Improving Faithfulness of Chain-of-Thought Reasoning…☆13Jan 14, 2026Updated 2 months ago
- ☆46Oct 11, 2023Updated 2 years ago
- ☆25Sep 3, 2025Updated 6 months ago
- ACL24☆11Jun 7, 2024Updated last year
- A high-performance implementation of GerryChain in Julia☆19Jan 3, 2022Updated 4 years ago
- Code for reproducing our paper "Not All Language Model Features Are Linear"☆84Nov 27, 2024Updated last year
- Spurious Features Everywhere - Large-Scale Detection of Harmful Spurious Features in ImageNet☆32Aug 22, 2023Updated 2 years ago
- PyTorch routines for (Ker)nel (Mac)hines☆11Oct 10, 2025Updated 5 months ago
- Simple MoE - Day 17 of 365 Days of Repos☆17Jan 17, 2025Updated last year
- Comparison of gradient estimation techniques for black-box adversarial examples☆11Oct 31, 2018Updated 7 years ago
- The simplest repository for training medium-sized BackpackLM for cs224n☆25Aug 13, 2023Updated 2 years ago
- A project designed to build and render a full Minecraft crafting tree.☆10Aug 10, 2021Updated 4 years ago
- Exploring the minimal architecture required for coherent English language generation.☆12Mar 5, 2025Updated last year
- ☆22Feb 5, 2026Updated last month
- Accelerating Transfer Learning with Robust Neural Nets☆11Oct 2, 2020Updated 5 years ago
- 6,080-param transformer achieving 100% accuracy on 10-digit addition. Trained from scratch in 10 minutes.☆22Feb 19, 2026Updated last month
- ☆12Sep 16, 2024Updated last year
- ☆15Dec 7, 2021Updated 4 years ago