yizhangzzz / transformers-lego
☆18 Updated 2 years ago
Alternatives and similar repositories for transformers-lego:
Users who are interested in transformers-lego are comparing it to the repositories listed below.
- Code Release for "Broken Neural Scaling Laws" (BNSL) paper ☆57 Updated last year
- PyTorch implementation for "Long Horizon Temperature Scaling", ICML 2023 ☆20 Updated last year
- ☆20 Updated 3 months ago
- This repository includes code to reproduce the tables in "Loss Landscapes are All You Need: Neural Network Generalization Can Be Explaine… ☆34 Updated last year
- Efficient Scaling laws and collaborative pretraining. ☆13 Updated 2 months ago
- Official code for the paper "Compositional Generalization from First Principles" (NeurIPS 2023) ☆9 Updated last year
- ☆35 Updated last year
- ☆28 Updated last year
- SGD with large step sizes learns sparse features [ICML 2023] ☆32 Updated last year
- ☆29 Updated last year
- An ML research codebase built with friends :) ☆22 Updated 4 months ago
- ModelDiff: A Framework for Comparing Learning Algorithms ☆54 Updated last year
- Latest Weight Averaging (NeurIPS HITY 2022) ☆28 Updated last year
- Transformers with doubly stochastic attention ☆44 Updated 2 years ago
- PyTorch implementation for "Probabilistic Circuits for Variational Inference in Discrete Graphical Models", NeurIPS 2020 ☆15 Updated 3 years ago
- Investigate the speed of adaptation of structural causal models ☆16 Updated 3 years ago
- Meta-learning inductive biases in the form of useful conserved quantities. ☆37 Updated 2 years ago
- Experiments on GPT-3's ability to fit numerical models in-context. ☆14 Updated 2 years ago
- Universal Neurons in GPT2 Language Models ☆27 Updated 7 months ago
- ☆58 Updated 3 years ago
- Minimum Description Length probing for neural network representations ☆18 Updated last week
- ☆35 Updated 2 years ago
- ☆17 Updated 3 years ago
- A minimal implementation of a VAE with BinConcrete (relaxed Bernoulli) latent distribution in TensorFlow. ☆21 Updated 4 years ago
- Code for Accelerated Linearized Laplace Approximation for Bayesian Deep Learning (ELLA, NeurIPS '22) ☆16 Updated 2 years ago
- Usable implementation of Emerging Symbol Binding Network (ESBN), in Pytorch ☆23 Updated 4 years ago
- Experiments and code to generate the GINC small-scale in-context learning dataset from "An Explanation for In-context Learning as Implici… ☆100 Updated last year
- ☆22 Updated 3 years ago
- ☆21 Updated last year
- ☆44 Updated last year