☆376 · Jul 2, 2024 · Updated last year
Alternatives and similar repositories for evals
Users that are interested in evals are comparing it to the libraries listed below.
- Hypercorn is an ASGI and WSGI Server based on Hyper libraries and inspired by Gunicorn. ☆15 · Jan 12, 2026 · Updated 3 months ago
- Human preference data for "Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback" ☆1,840 · Jun 17, 2025 · Updated 10 months ago
- ☆260 · Dec 21, 2022 · Updated 3 years ago
- ☆28 · Sep 5, 2024 · Updated last year
- (Model-written) LLM evals library ☆18 · Dec 13, 2024 · Updated last year
- ☆283 · Mar 2, 2024 · Updated 2 years ago
- Keeping language models honest by directly eliciting knowledge encoded in their activations. ☆218 · Apr 27, 2026 · Updated last week
- ☆14 · Jan 21, 2025 · Updated last year
- ☆22 · Sep 9, 2021 · Updated 4 years ago
- datasets from the paper "Towards Understanding Sycophancy in Language Models" ☆118 · Oct 25, 2023 · Updated 2 years ago
- ☆1,077 · Mar 6, 2024 · Updated 2 years ago
- Fast, correct Python JSON library supporting dataclasses, datetimes, and numpy ☆49 · Updated this week
- ControlArena is a collection of settings, model organisms and protocols - for running control experiments. ☆189 · Apr 27, 2026 · Updated last week
- A library for bridging Python and HTML/Javascript (via Svelte) for creating interactive visualizations ☆216 · Dec 22, 2021 · Updated 4 years ago
- Notebooks accompanying Anthropic's "Toy Models of Superposition" paper ☆145 · Sep 14, 2022 · Updated 3 years ago
- METR Task Standard ☆179 · Feb 3, 2025 · Updated last year
- Evaluating the Moral Beliefs Encoded in LLMs ☆36 · Dec 17, 2024 · Updated last year
- LLM experiments done during SERI MATS - focusing on activation steering / interpreting activation spaces ☆104 · Sep 21, 2023 · Updated 2 years ago
- ☆103 · Mar 4, 2024 · Updated 2 years ago
- ☆10 · Feb 2, 2024 · Updated 2 years ago
- Measuring the situational awareness of language models ☆41 · Feb 12, 2024 · Updated 2 years ago
- A library for mechanistic interpretability of GPT-style language models ☆3,383 · Updated this week
- ☆12 · Oct 23, 2022 · Updated 3 years ago
- Implementation of Influence Function approximations for differently sized ML models, using PyTorch ☆16 · Sep 15, 2023 · Updated 2 years ago
- Representation Engineering: A Top-Down Approach to AI Transparency ☆989 · Aug 14, 2024 · Updated last year
- ☆121 · Jan 19, 2026 · Updated 3 months ago
- Interactive Composition Explorer: a debugger for compositional language model programs ☆569 · Apr 6, 2026 · Updated 3 weeks ago
- ☆27 · Mar 13, 2024 · Updated 2 years ago
- Mechanistic Interpretability for Transformer Models ☆53 · Jun 1, 2022 · Updated 3 years ago
- ☆41 · Feb 11, 2025 · Updated last year
- Tools for understanding how transformer predictions are built layer-by-layer ☆586 · Aug 7, 2025 · Updated 8 months ago
- Used for adaptive human in the loop evaluation of language and embedding models. ☆307 · Mar 1, 2023 · Updated 3 years ago
- Investigating the generalization behavior of LM probes trained to predict truth labels: (1) from one annotator to another, and (2) from e… ☆30 · May 23, 2024 · Updated last year
- The AI that helps you achieve your goals ☆11 · Feb 4, 2024 · Updated 2 years ago
- ☆14 · Jul 5, 2024 · Updated last year
- ☆30 · Jun 19, 2023 · Updated 2 years ago
- ☆2,556 · May 19, 2024 · Updated last year
- ☆562 · Feb 5, 2024 · Updated 2 years ago
- Display and customize Markdown text in SwiftUI ☆46 · Jan 28, 2025 · Updated last year