yizhangzzz / transformers-lego
☆18 Updated 2 years ago
Alternatives and similar repositories for transformers-lego:
Users who are interested in transformers-lego are comparing it to the repositories listed below.
- Code Release for "Broken Neural Scaling Laws" (BNSL) paper ☆58 Updated last year
- PyTorch implementation for "Long Horizon Temperature Scaling", ICML 2023 ☆20 Updated last year
- Official code for the paper "Compositional Generalization from First Principles" (NeurIPS 2023) ☆10 Updated last year
- ☆26 Updated last year
- ☆60 Updated 3 years ago
- This repository includes code to reproduce the tables in "Loss Landscapes are All You Need: Neural Network Generalization Can Be Explaine… ☆36 Updated 2 years ago
- ☆29 Updated last year
- ☆36 Updated 2 years ago
- An ML research codebase built with friends :) ☆23 Updated 7 months ago
- ModelDiff: A Framework for Comparing Learning Algorithms ☆56 Updated last year
- ☆25 Updated 2 years ago
- ☆34 Updated last year
- Expertise modeling for the OpenReview matching system ☆35 Updated last week
- ☆22 Updated 3 years ago
- [ACL 2023] Training Trajectories of Language Models Across Scales https://arxiv.org/pdf/2212.09803.pdf ☆23 Updated last year
- ☆35 Updated last year
- Code for the paper "Data Feedback Loops: Model-driven Amplification of Dataset Biases" ☆15 Updated 2 years ago
- Few-shot Learning with Auxiliary Data ☆27 Updated last year
- ☆52 Updated 6 months ago
- Official implementation of "Multi-scale Feature Learning Dynamics: Insights for Double Descent" ☆16 Updated 2 years ago
- Experiments on GPT-3's ability to fit numerical models in-context ☆14 Updated 2 years ago
- Official code for the paper "Metadata Archaeology" ☆19 Updated last year
- Simple Scalable Discrete Diffusion for text in PyTorch ☆33 Updated 6 months ago
- SGD with large step sizes learns sparse features [ICML 2023] ☆32 Updated last year
- Experiments and code to generate the GINC small-scale in-context learning dataset from "An Explanation for In-context Learning as Implici… ☆105 Updated last year
- ☆17 Updated 2 years ago
- [NeurIPS'20] Code for the paper "Compositional Visual Generation and Inference with Energy Based Models" ☆44 Updated 2 years ago
- Compositional Capabilities of Autoregressive Transformers: A Study on Synthetic, Interpretable Tasks ☆10 Updated 9 months ago
- Minimum Description Length probing for neural network representations ☆19 Updated 2 months ago
- ZeroC is a neuro-symbolic method that trained with elementary visual concepts and relations, can zero-shot recognize and acquire more com… ☆31 Updated last year