alan-turing-institute / robots-in-disguise
Information and materials for the Turing's "robots-in-disguise" reading group on fundamental AI research.
☆33 · Updated 6 months ago
Alternatives and similar repositories for robots-in-disguise
Users interested in robots-in-disguise are comparing it to the repositories listed below.
- 🧠 Starter templates for doing interpretability research (☆74, updated 2 years ago)
- ViT Prisma is a mechanistic interpretability library for Vision and Video Transformers (ViTs) (☆309, updated 2 months ago)
- 👋 Overcomplete is a Vision-based SAE Toolbox (☆82, updated last month)
- we got you bro (☆36, updated last year)
- ☆83, updated last year
- ☆68, updated 2 years ago
- List of ML conferences with important dates and accepted paper lists (☆152, updated last week)
- Notebooks accompanying Anthropic's "Toy Models of Superposition" paper (☆129, updated 3 years ago)
- MetaQuantus is an XAI performance tool to identify reliable evaluation metrics (☆39, updated last year)
- Tools for studying developmental interpretability in neural networks (☆103, updated 3 months ago)
- Sparse Autoencoder for Mechanistic Interpretability (☆267, updated last year)
- Official repository for CMU Machine Learning Department's 10721: "Philosophical Foundations of Machine Intelligence" (☆262, updated 2 years ago)
- LENS Project (☆50, updated last year)
- OpenXAI: Towards a Transparent Evaluation of Model Explanations (☆247, updated last year)
- Package for extracting and mapping the results of every single tensor operation in a PyTorch model in one line of code (☆615, updated 6 months ago)
- ☆81, updated 7 months ago
- Modalities, a PyTorch-native framework for distributed and reproducible foundation model training (☆85, updated this week)
- This repository collects all relevant resources about interpretability in LLMs (☆372, updated 10 months ago)
- Causal Responsibility EXplanations for Image Classifiers and Tabular Data (☆37, updated last month)
- Zennit is a high-level framework in Python using PyTorch for explaining/exploring neural networks with attribution methods like LRP (☆233, updated last month)
- ☆27, updated 2 years ago
- ☆345, updated last month
- The boundary of neural network trainability is fractal (☆216, updated last year)
- 👋 Aligning Human & Machine Vision using explainability (☆52, updated 2 years ago)
- Resources for skilling up in AI alignment research engineering; covers the basics of deep learning, mechanistic interpretability, and RL (☆227, updated last month)
- ☆122, updated 3 years ago
- Croissant is a high-level format for machine learning datasets that brings together four rich layers (☆718, updated last week)
- An eXplainable AI toolkit with Concept Relevance Propagation and Relevance Maximization (☆131, updated last year)
- Layer-wise Relevance Propagation for Large Language Models and Vision Transformers [ICML 2024] (☆189, updated 2 months ago)
- Exca - Execution and caching tool for Python (☆104, updated this week)