alan-turing-institute / robots-in-disguise
Information and materials for the Turing's "robots-in-disguise" reading group on fundamental AI research.
☆33 Updated 7 months ago
Alternatives and similar repositories for robots-in-disguise
Users who are interested in robots-in-disguise are comparing it to the repositories listed below.
- 🧠 Starter templates for doing interpretability research ☆74 Updated 2 years ago
- ☆82 Updated last year
- MetaQuantus is an XAI performance tool to identify reliable evaluation metrics ☆39 Updated last year
- ViT Prisma is a mechanistic interpretability library for Vision and Video Transformers (ViTs). ☆319 Updated 3 months ago
- ☆69 Updated 2 years ago
- Official repository for CMU Machine Learning Department's 10721: "Philosophical Foundations of Machine Intelligence". ☆263 Updated 2 years ago
- ☆75 Updated 2 years ago
- Causal Responsibility EXplanations for Image Classifiers and Tabular Data ☆39 Updated this week
- we got you bro ☆36 Updated last year
- LENS Project ☆50 Updated last year
- Tools for studying developmental interpretability in neural networks. ☆112 Updated 4 months ago
- This repository collects all relevant resources about interpretability in LLMs ☆377 Updated last year
- NeuroSurgeon is a package that enables researchers to uncover and manipulate subnetworks within models in Huggingface Transformers ☆41 Updated 8 months ago
- Deep Learning, an Energy Approach ☆218 Updated 4 months ago
- The M2L school 2022 tutorials ☆36 Updated 3 years ago
- List of ML conferences with important dates and accepted paper list ☆165 Updated 3 weeks ago
- Code for the arXiv paper "Double Descent Demystified: Identifying, Interpreting & Ablating the Sources of a Deep Learning Puzzle" ☆57 Updated last year
- PyTorch code corresponding to my blog series on adversarial examples and (confidence-calibrated) adversarial training. ☆67 Updated 2 years ago
- 👋 Aligning Human & Machine Vision using explainability ☆52 Updated 2 years ago
- ☆122 Updated 3 years ago
- An eXplainable AI toolkit with Concept Relevance Propagation and Relevance Maximization ☆136 Updated last year
- Notebooks accompanying Anthropic's "Toy Models of Superposition" paper ☆129 Updated 3 years ago
- Tools for understanding how transformer predictions are built layer-by-layer ☆536 Updated 2 months ago
- Sparse Autoencoder for Mechanistic Interpretability ☆278 Updated last year
- Mechanistic Interpretability Visualizations using React ☆297 Updated 10 months ago
- A course on imprecise probabilistic machine learning ☆80 Updated last week
- ☆81 Updated 8 months ago
- 👋 Overcomplete is a Vision-based SAE Toolbox ☆96 Updated 3 months ago
- Package for extracting and mapping the results of every single tensor operation in a PyTorch model in one line of code. ☆622 Updated last month
- Stanford NLP Python library for understanding and improving PyTorch models via interventions ☆823 Updated 3 weeks ago