ayyucekizrak / Mechanistic-Interpretability
Mechanistic Interpretability in Transformers: This repository explores advanced techniques like Induction Head Detection and QK Circuit Analysis to uncover the inner workings of transformer-based models.
☆27Updated 11 months ago
Alternatives and similar repositories for Mechanistic-Interpretability
Users who are interested in Mechanistic-Interpretability compare it to the repositories listed below.
- A curated list of Turkish AI models, datasets, papers☆43Updated this week
- An advanced tokenizer infrastructure, based on linguistic rules, for processing multilingual text and preserving semantic integrity …☆15Updated last week
- This repo contains Lyra AI's work in the E-Commerce Hackathon organized by Trendyol and Teknofest.☆13Updated 9 months ago
- ☆21Updated 5 months ago
- Turkish LM Tuner☆84Updated 9 months ago
- ☆53Updated 9 months ago
- A practical, notebook-based AI and data science handbook for learners and researchers. Built with Python & Jupyter. (Turkish version: Python and …☆14Updated last week
- Unofficial implementation of https://arxiv.org/pdf/2407.14679☆48Updated 11 months ago
- Bundled functions and classes for torch-based machine learning projects.☆14Updated 9 months ago
- ☆335Updated last week
- Persona Vectors: Monitoring and Controlling Character Traits in Language Models☆200Updated 3 weeks ago
- ☆187Updated 9 months ago
- Curated list of open access books☆44Updated 5 months ago
- LlamaTurk: Adapting Open-Source Generative Large Language Models for Low-Resource Language☆18Updated 8 months ago
- Delphi was the home of a temple to Phoebus Apollo, which famously had the inscription, 'Know Thyself.' This library lets language models …☆207Updated this week
- Repo for Turkish Wiki NER dataset.☆11Updated 2 years ago
- ☆16Updated 2 years ago
- Public repository for "The Surprising Effectiveness of Test-Time Training for Abstract Reasoning"☆327Updated 9 months ago
- ☆184Updated last month
- Improving Steering Vectors by Targeting Sparse Autoencoder Features☆24Updated 9 months ago
- The official implementation of the paper "What Matters in Transformers? Not All Attention is Needed".☆174Updated 5 months ago
- ☆163Updated 9 months ago
- Steering vectors for transformer language models in Pytorch / Huggingface☆122Updated 6 months ago
- Multi-Layer Sparse Autoencoders (ICLR 2025)☆24Updated 6 months ago
- Code for NeurIPS'24 paper 'Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization'☆229Updated last month
- Tools for optimizing steering vectors in LLMs.☆11Updated 4 months ago
- Mastering Large Language Models: Build Your Own LLM from Scratch☆124Updated 2 months ago
- Interactive deep learning book with code, math, multiple frameworks, and discussions. In 55 countries, Stanford, MIT, Harvard, and Cambridge …☆121Updated 2 years ago
- Open source interpretability artefacts for R1.☆158Updated 4 months ago
- Code for reproducing our paper "Not All Language Model Features Are Linear"☆77Updated 9 months ago