tdooms / bilinear-decompositionLinks

Official repo for the paper "Weight-based Decomposition: A Case for Bilinear MLPs"

☆23

Alternatives and similar repositories for bilinear-decomposition

Users that are interested in bilinear-decomposition are comparing it to the libraries listed below

Sorting:

bartbussmann / BatchTopK
Implementation of the BatchTopK activation function for training sparse autoencoders (SAEs)
☆50Updated 2 months ago
JoshEngels / MultiDimensionalFeatures
Code for reproducing our paper "Not All Language Model Features Are Linear"
☆81Updated 10 months ago
ApolloResearch / e2e_sae
Sparse Autoencoder Training Library
☆55Updated 5 months ago
noranta4 / ASIF
Personal implementation of ASIF by Antonio Norelli
☆26Updated last year
mcleish7 / gemstone-scaling-laws
Gemstones: A Model Suite for Multi-Faceted Scaling Laws (NeurIPS 2025)
☆29Updated 3 weeks ago
formll / resolving-scaling-law-discrepancies
☆20Updated last year
taufeeque9 / codebook-features
Sparse and discrete interpretability tool for neural networks
☆64Updated last year
bartbussmann / matryoshka_sae
☆47Updated 9 months ago
jonhue / activeft
PyTorch library for Active Fine-Tuning
☆93Updated 3 weeks ago
wesg52 / universal-neurons
Universal Neurons in GPT2 Language Models
☆30Updated last year
dmis-lab / Monet
[ICLR 2025] Monet: Mixture of Monosemantic Experts for Transformers
☆73Updated 3 months ago
epfml / schedules-and-scaling
Code for NeurIPS 2024 Spotlight: "Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations"
☆84Updated 11 months ago
tml-epfl / why-weight-decay
Why Do We Need Weight Decay in Modern Deep Learning? [NeurIPS 2024]
☆68Updated last year
berlino / seq_icl
☆53Updated last year
adamkarvonen / SAE_BoardGameEval
☆23Updated 8 months ago
stanislavfort / dissect-git-re-basin
Replicating and dissecting the git-re-basin project in one-click-replication Colabs
☆35Updated 3 years ago
koayon / atp_star
PyTorch and NNsight implementation of AtP* (Kramar et al 2024, DeepMind)
☆20Updated 9 months ago
katiekang1998 / reasoning_generalization
☆33Updated 9 months ago
KihoPark / linear_rep_geometry
☆106Updated 8 months ago
Ping-C / optimizer
This repository includes code to reproduce the tables in "Loss Landscapes are All You Need: Neural Network Generalization Can Be Explaine…
☆40Updated 2 years ago
Nix07 / finetuning
This repository contains the code used for the experiments in the paper "Fine-Tuning Enhances Existing Mechanisms: A Case Study on Entity…
☆28Updated last year
nrimsky / InfluenceFunctions
Implementation of Influence Function approximations for differently sized ML models, using PyTorch
☆15Updated 2 years ago
tim-lawson / mlsae
Multi-Layer Sparse Autoencoders (ICLR 2025)
☆26Updated 8 months ago
lucidrains / pause-transformer
Yet another random morning idea to be quickly tried and architecture shared if it works; to allow the transformer to pause for any amount…
☆52Updated last year
ChenWu98 / algorithmic-creativity
[ICML 2025] Roll the dice & look before you leap: Going beyond the creative limits of next-token prediction
☆72Updated 4 months ago
IBM / ColPret
Efficient Scaling laws and collaborative pretraining.
☆18Updated last month
igul222 / plaid
☆107Updated 2 years ago
ckkissane / crosscoder-model-diff-replication
Open source replication of Anthropic's Crosscoders for Model Diffing
☆59Updated 11 months ago
mlfoundations / scaling
Language models scale reliably with over-training and on downstream tasks
☆100Updated last year
fal-ai-community / nano-mdm
Tiny re-implementation of MDM in style of LLaDA and nano-gpt speedrun
☆56Updated 7 months ago