(Model-written) LLM evals library
☆18Dec 13, 2024Updated last year
Alternatives and similar repositories for evalugator
Users who are interested in evalugator are comparing it to the libraries listed below.
- A quick way to get started with Transformer Lens☆14Dec 13, 2023Updated 2 years ago
- Experiments with representation engineering☆13Feb 28, 2024Updated 2 years ago
- Situational Awareness Dataset☆46Dec 14, 2024Updated last year
- The Happy Faces Benchmark☆15Jul 20, 2023Updated 2 years ago
- Tools for studying developmental interpretability in neural networks.☆125Dec 30, 2025Updated 2 months ago
- Improving Steering Vectors by Targeting Sparse Autoencoder Features☆27Nov 20, 2024Updated last year
- ☆60Mar 8, 2022Updated 3 years ago
- A toolbox with the goal of speeding up research on bargaining in MARL (cooperation problems in MARL).☆32Sep 29, 2022Updated 3 years ago
- Decoder only transformer, built from scratch with PyTorch☆33Oct 22, 2023Updated 2 years ago
- Archive of questions from the Cambridge Mathematics Tripos☆10Jun 6, 2022Updated 3 years ago
- Machine Learning for Alignment Bootcamp☆82Apr 27, 2022Updated 3 years ago
- ☆36Apr 30, 2024Updated last year
- Proof-carrying code completions in Dafny☆11Apr 4, 2025Updated 10 months ago
- METR Task Standard☆177Feb 3, 2025Updated last year
- ☆13May 7, 2023Updated 2 years ago
- The AI that helps you achieve your goals☆11Feb 4, 2024Updated 2 years ago
- Our work on Reinforcement learning that we share with the rest of the world☆13Jan 7, 2019Updated 7 years ago
- ☆329Jul 2, 2024Updated last year
- A Python implementation of the Nobel Prize-winning matching algorithm.☆10Nov 22, 2023Updated 2 years ago
- ☆15Jul 21, 2023Updated 2 years ago
- Code to support the guide to logical induction for software engineers☆11Mar 24, 2025Updated 11 months ago
- ☆10Feb 25, 2020Updated 6 years ago
- Apps that run on modal.com☆13Sep 14, 2025Updated 5 months ago
- Used TensorFlow to build a neural network that can predict fraudulent credit card transactions.☆10Jun 21, 2017Updated 8 years ago
- Track Rust's await bikeshedding☆12Jul 7, 2020Updated 5 years ago
- Tutorial accompanying the YouTube video .. meant for educational purposes. The Machine Learning done in this project is very simplistic a…☆11Feb 10, 2021Updated 5 years ago
- 'Building a CI/CD pipeline for a Lambda function using AWS CodePipeline' article and source code.☆12Oct 31, 2021Updated 4 years ago
- A tool for the automatic generation of Isabelle/HOL correctness proofs for security protocols.☆18Jun 21, 2015Updated 10 years ago
- Pin files for contextual, codebase-level AI assistance.☆16Jul 11, 2024Updated last year
- Code execution runtime for the STAC Overflow: Map Floodwater from Radar Imagery competition☆12Sep 29, 2021Updated 4 years ago
- Functional lenses on C++ (highly experimental).☆10Aug 13, 2019Updated 6 years ago
- Example fNIRS BIDS dataset☆14Nov 4, 2022Updated 3 years ago
- Image Segmentation with PixelLib and Streamlit☆11May 12, 2021Updated 4 years ago
- ☆13Jul 24, 2021Updated 4 years ago
- Dealing with Flags and Options☆13Sep 10, 2021Updated 4 years ago
- ☆12Jul 12, 2024Updated last year
- Create feature-centric and prompt-centric visualizations for sparse autoencoders (like those from Anthropic's published research).☆245Dec 16, 2024Updated last year
- Tool for converting LaTeX-prepared documents to WordPress-ready HTML☆14Mar 11, 2025Updated 11 months ago
- ☆13Aug 9, 2023Updated 2 years ago