LRudL/evalugator

LRudL / evalugator

(Model-written) LLM evals library

☆18

Alternatives and similar repositories for evalugator

Users interested in evalugator are comparing it to the libraries listed below.

alan-cooney / transformer-lens-starter-template
A quick way to get started with Transformer Lens
☆14 · Dec 13, 2023 · Updated 2 years ago
mishajw / repeng
Experiments with representation engineering
☆13 · Feb 28, 2024 · Updated 2 years ago
LRudL / sad
Situational Awareness Dataset
☆46 · Dec 14, 2024 · Updated last year
alignedai / HappyFaces
The Happy Faces Benchmark
☆15 · Jul 20, 2023 · Updated 2 years ago
timaeus-research / devinterp
Tools for studying developmental interpretability in neural networks.
☆125 · Dec 30, 2025 · Updated 2 months ago
slavachalnev / SAE-TS
Improving Steering Vectors by Targeting Sparse Autoencoder Features
☆27 · Nov 20, 2024 · Updated last year
cgarciae / einop
☆60 · Mar 8, 2022 · Updated 3 years ago
longtermrisk / marltoolbox
A toolbox with the goal of speeding up research on bargaining in MARL (cooperation problems in MARL).
☆32 · Sep 29, 2022 · Updated 3 years ago
alan-cooney / transformer-from-scratch
Decoder only transformer, built from scratch with PyTorch
☆33 · Oct 22, 2023 · Updated 2 years ago
tripos-education / maths-tripos-questions
Archive of questions from the Cambridge Mathematics Tripos
☆10 · Jun 6, 2022 · Updated 3 years ago
redwoodresearch / mlab
Machine Learning for Alignment Bootcamp
☆82 · Apr 27, 2022 · Updated 3 years ago
amack315 / unsupervised-steering-vectors
☆36 · Apr 30, 2024 · Updated last year
DavisPL / PCCC
Proof-carrying code completions in Dafny
☆11 · Apr 4, 2025 · Updated 10 months ago
METR / task-standard
METR Task Standard
☆177 · Feb 3, 2025 · Updated last year
MichaelEinhorn / trl-textworld
☆13 · May 7, 2023 · Updated 2 years ago
UlisseMini / ana
The AI that helps you achieve your goals
☆11 · Feb 4, 2024 · Updated 2 years ago
manifoldai / reinforcement_learning_public
Our work on Reinforcement learning that we share with the rest of the world
☆13 · Jan 7, 2019 · Updated 7 years ago
anthropics / evals
☆329 · Jul 2, 2024 · Updated last year
J-DM / Roth-Peranson
A Python implementation of the Nobel Prize-winning matching algorithm.
☆10 · Nov 22, 2023 · Updated 2 years ago
OliverEvans96 / maturin-nix-example
☆15 · Jul 21, 2023 · Updated 2 years ago
monasticacademy / logical-induction
Code to support the guide to logical induction for software engineers
☆11 · Mar 24, 2025 · Updated 11 months ago
gstew5 / games
☆10 · Feb 25, 2020 · Updated 6 years ago
nateraw / modal-examples
Apps that run on modal.com
☆13 · Sep 14, 2025 · Updated 5 months ago
Currie32 / Predicting-Credit-Card-Fraud
Used TensorFlow to build a neural network that can predict fraudulent credit card transactions.
☆10 · Jun 21, 2017 · Updated 8 years ago
emilyalbini / areweawaityet
Track Rust's await bikeshedding
☆12 · Jul 7, 2020 · Updated 5 years ago
SandroLuck / TutorialMachineLearningAPI
Tutorial accompanying the YouTube video, meant for educational purposes. The Machine Learning done in this project is very simplistic a…
☆11 · Feb 10, 2021 · Updated 5 years ago
aliakh / demo-ci-cd-lambda-function
'Building a CI/CD pipeline for a Lambda function using AWS CodePipeline' article and source code.
☆12 · Oct 31, 2021 · Updated 4 years ago
meiersi / scyther-proof
A tool for the automatic generation of Isabelle/HOL correctness proofs for security protocols.
☆18 · Jun 21, 2015 · Updated 10 years ago
noemaresearch / pinboard
Pin files for contextual, codebase-level AI assistance.
☆16 · Jul 11, 2024 · Updated last year
drivendataorg / floodwater-runtime
Code execution runtime for the STAC Overflow: Map Floodwater from Radar Imagery competition
☆12 · Sep 29, 2021 · Updated 4 years ago
graninas / cpp_lenses
Functional lenses on C++ (highly experimental).
☆10 · Aug 13, 2019 · Updated 6 years ago
rob-luke / BIDS-NIRS-Tapping
Example fNIRS BIDS dataset
☆14 · Nov 4, 2022 · Updated 3 years ago
taeokimeng / image-segmentation-streamlit
Image Segmentation with PixelLib and Streamlit
☆11 · May 12, 2021 · Updated 4 years ago
chasenorman / Formalized-Voting
☆13 · Jul 24, 2021 · Updated 4 years ago
gallais / agdARGS
Dealing with Flags and Options
☆13 · Sep 10, 2021 · Updated 4 years ago
Cadenza-Labs / sleeper-agents
☆12 · Jul 12, 2024 · Updated last year
callummcdougall / sae_vis
Create feature-centric and prompt-centric visualizations for sparse autoencoders (like those from Anthropic's published research).
☆245 · Dec 16, 2024 · Updated last year
seaneberhard / latex2wp
Tool for converting LaTeX-prepared documents to Wordpress-ready HTML
☆14 · Mar 11, 2025 · Updated 11 months ago
Breakend / SelfDestructingModels
☆13 · Aug 9, 2023 · Updated 2 years ago