yizhangzzz / transformers-lego
☆18 Updated 2 years ago
Alternatives and similar repositories for transformers-lego
Users who are interested in transformers-lego are comparing it to the libraries listed below.
- Code Release for "Broken Neural Scaling Laws" (BNSL) paper ☆59 Updated last year
- Investigate the speed of adaptation of structural causal models ☆15 Updated 4 years ago
- Google Research ☆46 Updated 2 years ago
- ☆29 Updated 2 years ago
- Closed-form polynomial approximations to neural networks ☆13 Updated 5 months ago
- Code for minimum-entropy coupling. ☆32 Updated last year
- Minimum Description Length probing for neural network representations ☆18 Updated 5 months ago
- Efficient Scaling laws and collaborative pretraining. ☆16 Updated 5 months ago
- Experiments on GPT-3's ability to fit numerical models in-context. ☆14 Updated 2 years ago
- Official code for the paper "Compositional Generalization from First Principles" (NeurIPS 2023) ☆11 Updated last year
- Code and files for the paper "Are Emergent Abilities in Large Language Models just In-Context Learning" ☆33 Updated 6 months ago
- General Invertible Transformations for Flow-based Generative Models ☆18 Updated 4 years ago
- PyTorch implementation for "Long Horizon Temperature Scaling", ICML 2023 ☆20 Updated 2 years ago
- Compositional Capabilities of Autoregressive Transformers: A Study on Synthetic, Interpretable Tasks ☆10 Updated last year
- 🧮 Algebraic Positional Encodings. ☆16 Updated 6 months ago
- Official repository for the paper "Can You Learn an Algorithm? Generalizing from Easy to Hard Problems with Recurrent Networks" ☆59 Updated 3 years ago
- Understanding how features learned by neural networks evolve throughout training ☆36 Updated 8 months ago
- Official repository for our ICLR 2021 paper "Evaluating the Disentanglement of Deep Generative Models with Manifold Topology" ☆36 Updated 4 years ago
- Replicating and dissecting the git-re-basin project in one-click-replication Colabs ☆36 Updated 2 years ago
- An ML research codebase built with friends :) ☆24 Updated 10 months ago
- Implementations of growing and pruning in neural networks ☆22 Updated last year
- Code for the paper "The emergence of clusters in self-attention dynamics". ☆16 Updated last year
- Sparse and discrete interpretability tool for neural networks ☆63 Updated last year
- Official code for the paper: "Metadata Archaeology" ☆19 Updated 2 years ago
- Code for "Accelerating Training with Neuron Interaction and Nowcasting Networks" [to appear at ICLR 2025] ☆19 Updated last month
- Engineering the state of RNN language models (Mamba, RWKV, etc.) ☆32 Updated last year
- Latest Weight Averaging (NeurIPS HITY 2022) ☆30 Updated 2 years ago
- Meta-learning inductive biases in the form of useful conserved quantities. ☆37 Updated 2 years ago
- ☆11 Updated last year
- ☆11 Updated 3 years ago