ayyucekizrak / Mechanistic-Interpretability
Mechanistic Interpretability in Transformers: This repository explores advanced techniques like Induction Head Detection and QK Circuit Analysis to uncover the inner workings of transformer-based models.
☆32Sep 27, 2024Updated last year
Alternatives and similar repositories for Mechanistic-Interpretability
Users who are interested in Mechanistic-Interpretability are comparing it to the libraries listed below
- A tiny easily hackable implementation of a feature dashboard.☆15Oct 21, 2025Updated 3 months ago
- Sakarya University Computer Engineering course materials and past exam questions☆52Jan 25, 2026Updated 3 weeks ago
- Multi-Layer Sparse Autoencoders (ICLR 2025)☆29Feb 6, 2026Updated last week
- 🧠 Starter templates for doing interpretability research☆76Jul 16, 2023Updated 2 years ago
- FeatureAlignment = Alignment + Mechanistic Interpretability☆34Mar 8, 2025Updated 11 months ago
- Build an AI bot in Discord to serve user's personalized reports on what's up in tech☆28Sep 14, 2025Updated 5 months ago
- Residual Quantization Autoencoder, used for interpreting LLMs☆14Jan 1, 2025Updated last year
- Code for experiments on self-prediction as a way to measure introspection in LLMs☆16Dec 10, 2024Updated last year
- my profile readme☆14Updated this week
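- Re-identification of campus bicycles in surveillance-quality footage, including ReID model recognition, vector database retrieval, and a UI display☆10Feb 13, 2024Updated 2 years ago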
- ☆12Jul 8, 2024Updated last year
- An attempt to understand and explain, at the most basic level, the process of learning software development☆12Dec 3, 2020Updated 5 years ago
- Linear Relational Embeddings (LREs) and Linear Relational Concepts (LRCs) for LLMs in PyTorch☆10Aug 7, 2024Updated last year
- Reference implementation of Thin and Deep Gaussian Processes (NeurIPS 2023)☆13Nov 25, 2024Updated last year
- Code release for "Idiosyncrasies in Large Language Models"☆52Jul 21, 2025Updated 6 months ago
- ☆14May 21, 2024Updated last year
- A Java-based framework for combinatorial test input generation, fault characterization and automated test execution.☆11Jan 22, 2024Updated 2 years ago
- LongAttn: Selecting Long-context Training Data via Token-level Attention☆15Jul 16, 2025Updated 6 months ago
- 1st Place Team Crane: @aswinkumar1999 @rathull @kyolebu☆29Sep 8, 2025Updated 5 months ago
- Code for EMNLP'24 paper - On Diversified Preferences of Large Language Model Alignment☆16Aug 6, 2024Updated last year
- Unofficial implementation of "Gaussian-Flow: 4D Reconstruction with Dynamic 3D Gaussian Particle"☆13Jul 3, 2024Updated last year
- Math evaluations of llama models.☆10Jan 3, 2024Updated 2 years ago
- ☆12Jun 15, 2023Updated 2 years ago
- Unveiling and Mitigating Bias in Mental Health Analysis with Large Language Models☆12Jun 21, 2024Updated last year
- web-crawling (with AngleSharp)☆12May 26, 2025Updated 8 months ago
- The course work repo for UoSurrey EEEM071 (2023 Spring)☆11May 9, 2023Updated 2 years ago
- sealos deck☆11Mar 30, 2024Updated last year
- Complete football/soccer datalake with 93000+ players from Transfermarkt. Includes player profiles, performance statistics, market values…☆40Oct 18, 2025Updated 3 months ago
- A Benchmark for Multi-Stage Legal Case Documents Generation☆14Feb 24, 2025Updated 11 months ago
- Learning to Skip the Middle Layers of Transformers☆17Aug 7, 2025Updated 6 months ago
- ☆13Aug 7, 2024Updated last year
- 30 must-read deep learning papers recommended by Ilya Sutskever (bilingual Chinese-English translated edition)☆13Dec 18, 2024Updated last year
- Fake News Detection-Naive Bayes Model☆12Apr 23, 2020Updated 5 years ago
- ☆11Dec 15, 2025Updated 2 months ago
- Python package for compressing floating-point PyTorch tensors☆13Jul 22, 2024Updated last year
- Minimal Transformer base in JAX. A single backbone for language modelling, diffusion, classification, etc...☆14May 28, 2025Updated 8 months ago
- Source code for the module "Advanced Statistics" 📊☆10Feb 25, 2019Updated 6 years ago
- ☆15Apr 15, 2024Updated last year
- End to End Machine Learning Pipeline with scikit-learn☆12Mar 10, 2021Updated 4 years ago