gladia-research-group / explanatory-learningLinks

This is the official repository for "Explanatory Learning: Beyond Empiricism in Neural Networks".

☆14

Alternatives and similar repositories for explanatory-learning

Users that are interested in explanatory-learning are comparing it to the libraries listed below

Sorting:

Flegyas / latentis
A Python package for analyzing and transforming neural latent spaces.
☆52Updated 3 months ago
google-deepmind / neural_networks_chomsky_hierarchy
Neural Networks and the Chomsky Hierarchy
☆210Updated last year
TomFrederik / unseal
Mechanistic Interpretability for Transformer Models
☆53Updated 3 years ago
EleutherAI / concept-erasure
Erasing concepts from neural representations with provable guarantees
☆238Updated 9 months ago
april-tools / gekcs
How to Turn Your Knowledge Graph Embeddings into Generative Models
☆53Updated last year
anthropics / toy-models-of-superposition
Notebooks accompanying Anthropic's "Toy Models of Superposition" paper
☆129Updated 3 years ago
genlm / llamppl
Probabilistic programming with large language models
☆141Updated 3 months ago
probcomp / LLaMPPL
A domain-specific probabilistic programming language for modeling and inference with language models
☆136Updated 6 months ago
KindXiaoming / BIMT
Brain-Inspired Modular Training (BIMT), a method for making neural networks more modular and interpretable.
☆173Updated 2 years ago
collin-burns / discovering_latent_knowledge
☆279Updated last year
tommasomncttn / mergenetic
Flexible library for merging large language models (LLMs) via evolutionary optimization (ACL 2025 Demo).
☆90Updated 2 months ago
samacqua / LARC
Language-annotated Abstraction and Reasoning Corpus
☆93Updated 2 years ago
bilal-chughtai / rep-theory-mech-interp
☆27Updated 2 years ago
ethancaballero / broken_neural_scaling_laws
Code Release for "Broken Neural Scaling Laws" (BNSL) paper
☆59Updated last year
apartresearch / interpretability-starter
🧠 Starter templates for doing interpretability research
☆74Updated 2 years ago
ApolloResearch / apd
Attribution-based Parameter Decomposition
☆31Updated 4 months ago
EleMisi / VAEL
Codebase for VAEL: Bridging Variational Autoencoders and Probabilistic Logic Programming
☆21Updated 2 years ago
nalexai / hyperlib
Library that contains implementations of machine learning components in the hyperbolic space
☆142Updated last year
tech-srl / RASP
An interpreter for RASP as described in the ICML 2021 paper "Thinking Like Transformers"
☆320Updated last year
KihoPark / LLM_Categorical_Hierarchical_Representations
☆110Updated 8 months ago
EleutherAI / elk
Keeping language models honest by directly eliciting knowledge encoded in their activations.
☆211Updated this week
timaeus-research / devinterp
Tools for studying developmental interpretability in neural networks.
☆111Updated 4 months ago
neelnanda-io / 1L-Sparse-Autoencoder
☆128Updated 2 years ago
KihoPark / linear_rep_geometry
☆107Updated 8 months ago
jbloomAus / SAEDashboard
☆74Updated 2 weeks ago
EleutherAI / project-menu
See the issue board for the current status of active and prospective projects!
☆65Updated 3 years ago
noranta4 / ASIF
Personal implementation of ASIF by Antonio Norelli
☆26Updated last year
ArthurConmy / Automatic-Circuit-Discovery
☆247Updated last year
google-deepmind / transformer_grammars
Transformer Grammars: Augmenting Transformer Language Models with Syntactic Inductive Biases at Scale, TACL (2022)
☆130Updated 4 months ago
alignedai / HappyFaces
The Happy Faces Benchmark
☆15Updated 2 years ago