ethancaballero / broken_neural_scaling_laws
Code Release for "Broken Neural Scaling Laws" (BNSL) paper
☆58 Updated last year
Alternatives and similar repositories for broken_neural_scaling_laws:
Users who are interested in broken_neural_scaling_laws are comparing it to the repositories listed below.
- ☆61 Updated 2 years ago
- The official repository for our paper "The Devil is in the Detail: Simple Tricks Improve Systematic Generalization of Transformers". We s… ☆67 Updated 2 years ago
- ☆45 Updated last year
- ☆52 Updated 5 months ago
- Mechanistic Interpretability for Transformer Models ☆50 Updated 2 years ago
- ☆26 Updated last year
- ☆24 Updated 2 years ago
- The official repository for our paper "The Neural Data Router: Adaptive Control Flow in Transformers Improves Systematic Generalization". ☆33 Updated 3 years ago
- ☆51 Updated 10 months ago
- ☆81 Updated 7 months ago
- Code associated to papers on superposition (in ML interpretability) ☆28 Updated 2 years ago
- Standalone Product Key Memory module in Pytorch - for augmenting Transformer models ☆78 Updated 7 months ago
- Yet another random morning idea to be quickly tried and architecture shared if it works; to allow the transformer to pause for any amount… ☆53 Updated last year
- Sparse and discrete interpretability tool for neural networks ☆59 Updated last year
- Blog post ☆17 Updated last year
- Experiments and code to generate the GINC small-scale in-context learning dataset from "An Explanation for In-context Learning as Implici… ☆104 Updated last year
- Universal Neurons in GPT2 Language Models ☆27 Updated 9 months ago
- Implementation of Influence Function approximations for differently sized ML models, using PyTorch ☆15 Updated last year
- ☆38 Updated 4 years ago
- Google Research ☆46 Updated 2 years ago
- A Kernel-Based View of Language Model Fine-Tuning https://arxiv.org/abs/2210.05643 ☆74 Updated last year
- A case study of efficient training of large language models using commodity hardware. ☆68 Updated 2 years ago
- ☆34 Updated last year
- Silly twitter torch implementations. ☆46 Updated 2 years ago
- Official repository of paper "RNNs Are Not Transformers (Yet): The Key Bottleneck on In-context Retrieval" ☆26 Updated 11 months ago
- Code for minimum-entropy coupling. ☆31 Updated 8 months ago
- ☆47 Updated last year
- Implementation of Gated State Spaces, from the paper "Long Range Language Modeling via Gated State Spaces", in Pytorch ☆99 Updated 2 years ago
- Sparse Autoencoder Training Library ☆43 Updated 4 months ago
- This repository contains the code used for the experiments in the paper "Fine-Tuning Enhances Existing Mechanisms: A Case Study on Entity… ☆25 Updated last year