tdooms / bilinear-decomposition
Official repo for the paper "Weight-based Decomposition: A Case for Bilinear MLPs"
☆20 Updated 4 months ago
Alternatives and similar repositories for bilinear-decomposition:
Users who are interested in bilinear-decomposition are comparing it to the libraries listed below:
- Sparse and discrete interpretability tool for neural networks ☆59 Updated last year
- Personal implementation of ASIF by Antonio Norelli ☆25 Updated 10 months ago
- Sparse Autoencoder Training Library ☆47 Updated 5 months ago
- Simple and scalable tools for data-driven pretraining data selection. ☆18 Updated last month
- ☆18 Updated 8 months ago
- ☆22 Updated last month
- Yet another random morning idea to be quickly tried and architecture shared if it works; to allow the transformer to pause for any amount… ☆53 Updated last year
- Implementation of Bitune: Bidirectional Instruction-Tuning ☆19 Updated 9 months ago
- Replicating and dissecting the git-re-basin project in one-click-replication Colabs ☆36 Updated 2 years ago
- Why Do We Need Weight Decay in Modern Deep Learning? [NeurIPS 2024] ☆63 Updated 6 months ago
- This repository includes code to reproduce the tables in "Loss Landscapes are All You Need: Neural Network Generalization Can Be Explaine… ☆35 Updated 2 years ago
- Tiny re-implementation of MDM in style of LLaDA and nano-gpt speedrun ☆44 Updated 2 weeks ago
- Stanford NLP Python library for benchmarking the utility of LLM interpretability methods ☆62 Updated this week
- Official implementation of "BERTs are Generative In-Context Learners" ☆26 Updated 2 weeks ago
- ☆26 Updated last year
- ☆66 Updated 4 months ago
- ☆27 Updated last month
- ☆35 Updated 2 years ago
- Official repository of "LiNeS: Post-training Layer Scaling Prevents Forgetting and Enhances Model Merging" ☆24 Updated 4 months ago
- Code for reproducing our paper "Not All Language Model Features Are Linear" ☆73 Updated 4 months ago
- ☆33 Updated 6 months ago
- ☆52 Updated 5 months ago
- ☆34 Updated last year
- Code Release for "Broken Neural Scaling Laws" (BNSL) paper ☆58 Updated last year
- ☆31 Updated 2 months ago
- This repository contains the code used for the experiments in the paper "Fine-Tuning Enhances Existing Mechanisms: A Case Study on Entity… ☆25 Updated last year
- Code for the paper "Data Feedback Loops: Model-driven Amplification of Dataset Biases" ☆15 Updated 2 years ago
- Official code for the paper: "Metadata Archaeology" ☆19 Updated last year
- ☆28 Updated 8 months ago
- Implementation of PaCE: Parsimonious Concept Engineering for Large Language Models (NeurIPS 2024) ☆35 Updated 4 months ago