KhoomeiK / complexity-scaling
gzip Predicts Data-dependent Scaling Laws
☆35Updated last year
Alternatives and similar repositories for complexity-scaling
Users who are interested in complexity-scaling are comparing it to the libraries listed below.
- ☆27Updated 10 months ago
- Scaling is a distributed training library and installable dependency designed to scale up neural networks, with a dedicated module for tr…☆62Updated 7 months ago
- Sparse and discrete interpretability tool for neural networks☆63Updated last year
- ☆22Updated last year
- ☆60Updated last year
- Efficient Dictionary Learning with Switch Sparse Autoencoders (SAEs)☆23Updated 6 months ago
- A MAD laboratory to improve AI architecture designs 🧪☆116Updated 5 months ago
- ☆28Updated last year
- ☆52Updated last year
- ☆131Updated 2 months ago
- Code for reproducing our paper "Not All Language Model Features Are Linear"☆75Updated 6 months ago
- Universal Neurons in GPT2 Language Models☆29Updated last year
- ☆51Updated last year
- Transformer with Mu-Parameterization, implemented in Jax/Flax. Supports FSDP on TPU pods.☆30Updated last week
- ☆68Updated 9 months ago
- Sparse Autoencoder Training Library☆52Updated last month
- ☆38Updated 10 months ago
- The simplest, fastest repository for training/finetuning medium-sized GPTs.☆128Updated 3 weeks ago
- Code associated to papers on superposition (in ML interpretability)☆28Updated 2 years ago
- Experiments for efforts to train a new and improved t5☆77Updated last year
- A pipeline for using API calls to agnostically convert unstructured data into structured training data☆30Updated 8 months ago
- Triton Implementation of HyperAttention Algorithm☆48Updated last year
- ☆96Updated 3 months ago
- ☆26Updated 2 years ago
- Proof-of-concept of global switching between numpy/jax/pytorch in a library.☆18Updated 11 months ago
- ☆53Updated 8 months ago
- Functional Benchmarks and the Reasoning Gap☆86Updated 8 months ago
- LLM training in simple, raw C/CUDA☆14Updated 6 months ago
- Q-Probe: A Lightweight Approach to Reward Maximization for Language Models☆41Updated 11 months ago
- ☆79Updated 9 months ago