sambowyer / bayes_evals
A lightweight library for Bayesian analysis of LLM evals (ICML 2025 Spotlight Position Paper)
☆21 Updated 6 months ago
Alternatives and similar repositories for bayes_evals
Users who are interested in bayes_evals are comparing it to the libraries listed below.
- Extending Conformal Prediction to LLMs ☆68 Updated last year
- SDLG is an efficient method to accurately estimate aleatoric semantic uncertainty in LLMs ☆27 Updated last year
- Course Materials for Interpretability of Large Language Models (0368.4264) at Tel Aviv University ☆201 Updated last week
- Attribution-based Parameter Decomposition ☆32 Updated 5 months ago
- ☆111 Updated 9 months ago
- ☆38 Updated 7 months ago
- Probabilistic programming with large language models ☆144 Updated last week
- ☆79 Updated last year
- PyTorch library for Active Fine-Tuning ☆95 Updated 2 months ago
- Erasing concepts from neural representations with provable guarantees ☆239 Updated 10 months ago
- ☆24 Updated 7 months ago
- This is the repository for the CONFLARE (CONformal LArge language model REtrieval) Python package. ☆20 Updated last year
- ☆143 Updated 2 months ago
- Flexible library for merging large language models (LLMs) via evolutionary optimization (ACL 2025 Demo). ☆92 Updated 3 months ago
- Code for "Counterfactual Token Generation in Large Language Models", Arxiv 2024. ☆30 Updated last year
- Codebase for the paper "The Remarkable Robustness of LLMs: Stages of Inference?" ☆19 Updated 5 months ago
- Largest, cross-domain data set of human behavior. ☆84 Updated 4 months ago
- ☆68 Updated last year
- Understanding how features learned by neural networks evolve throughout training ☆39 Updated last year
- Arrakis is a library to conduct, track and visualize mechanistic interpretability experiments. ☆31 Updated 7 months ago
- Interpret text data with LLMs (sklearn compatible). ☆171 Updated last month
- relplot: Utilities for measuring calibration and plotting reliability diagrams ☆173 Updated 3 weeks ago
- Discovering Data-driven Hypotheses in the Wild ☆118 Updated 5 months ago
- Official codebase for "Quantile Reward Policy Optimization: Alignment with Pointwise Regression and Exact Partition Functions" (Matrenok … ☆27 Updated 3 weeks ago
- ☆87 Updated this week
- A package for statistically rigorous scientific discovery using machine learning. Implements prediction-powered inference. ☆265 Updated 2 months ago
- A library for calibrating classifiers and computing calibration metrics ☆14 Updated 3 years ago
- CausalGym: Benchmarking causal interpretability methods on linguistic tasks ☆49 Updated 11 months ago
- Testing Language Models for Memorization of Tabular Datasets. ☆36 Updated 9 months ago
- A collection of various LLM sampling methods implemented in pure Pytorch ☆26 Updated 11 months ago