sambowyer / bayes_evals
A lightweight library for Bayesian analysis of LLM evals (ICML 2025 Spotlight Position Paper)
☆21 Updated 3 months ago
Alternatives and similar repositories for bayes_evals
Users who are interested in bayes_evals are comparing it to the libraries listed below.
- Extending Conformal Prediction to LLMs ☆67 Updated last year
- Attribution-based Parameter Decomposition ☆30 Updated 3 months ago
- Discovering Data-driven Hypotheses in the Wild ☆111 Updated 3 months ago
- Codebase for the paper "The Remarkable Robustness of LLMs: Stages of Inference?" ☆18 Updated 3 months ago
- Interpret text data using LLMs (scikit-learn compatible). ☆170 Updated last month
- relplot: Utilities for measuring calibration and plotting reliability diagrams ☆169 Updated 2 months ago
- A collection of various LLM sampling methods implemented in pure PyTorch ☆23 Updated 9 months ago
- Probabilistic programming with large language models ☆136 Updated 2 months ago
- 🧠 Starter templates for doing interpretability research ☆74 Updated 2 years ago
- ☆32 Updated 5 months ago
- Notebooks accompanying Anthropic's "Toy Models of Superposition" paper ☆129 Updated 3 years ago
- Portfolio REgret for Confidence SEquences ☆20 Updated 9 months ago
- ☆22 Updated 5 months ago
- A statistical toolkit for scientific discovery using machine learning ☆80 Updated last year
- ☆229 Updated last month
- ☆71 Updated 3 weeks ago
- Testing Language Models for Memorization of Tabular Datasets. ☆35 Updated 7 months ago
- ☆107 Updated last year
- SDLG is an efficient method to accurately estimate aleatoric semantic uncertainty in LLMs ☆27 Updated last year
- ☆108 Updated 7 months ago
- Open source replication of Anthropic's Crosscoders for Model Diffing ☆59 Updated 10 months ago
- A Natural Language Interface to Explainable Boosting Machines ☆68 Updated last year
- A mechanistic approach for understanding and detecting factual errors of large language models. ☆47 Updated last year
- ☆81 Updated 7 months ago
- PyTorch library for Active Fine-Tuning ☆91 Updated 2 weeks ago
- ☆142 Updated 2 weeks ago
- ☆27 Updated 2 years ago
- Erasing concepts from neural representations with provable guarantees ☆234 Updated 8 months ago
- Sparse and discrete interpretability tool for neural networks ☆63 Updated last year
- Investigating the generalization behavior of LM probes trained to predict truth labels: (1) from one annotator to another, and (2) from e… ☆28 Updated last year