goldblum / free-lunch
Implementation of experiments from "The No Free Lunch Theorem, Kolmogorov Complexity, and the Role of Inductive Biases in Machine Learning"
☆17 · Updated 2 years ago
Alternatives and similar repositories for free-lunch
Users interested in free-lunch are comparing it to the libraries listed below.
- Official repository for the paper "Transformers Learn Higher-Order Optimization Methods for In-Context Learning: A Study with Linear Mode…" — ☆19 · Updated last year
- ☆53 · Updated last year
- Gemstones: A Model Suite for Multi-Faceted Scaling Laws (NeurIPS 2025) — ☆29 · Updated 2 months ago
- ☆24 · Updated 7 months ago
- nanoGPT-like codebase for LLM training — ☆110 · Updated 3 weeks ago
- Universal Neurons in GPT2 Language Models — ☆31 · Updated last year
- Omnigrok: Grokking Beyond Algorithmic Data — ☆62 · Updated 2 years ago
- ☆31 · Updated 8 months ago
- Code to reproduce the tables in "Loss Landscapes are All You Need: Neural Network Generalization Can Be Explaine…" — ☆40 · Updated 2 years ago
- Sparse and discrete interpretability tool for neural networks — ☆64 · Updated last year
- Revisiting Efficient Training Algorithms for Transformer-based Language Models (NeurIPS 2023) — ☆81 · Updated 2 years ago
- Sparse Autoencoder Training Library — ☆55 · Updated 6 months ago
- Code associated with papers on superposition (in ML interpretability) — ☆33 · Updated 3 years ago
- Latest Weight Averaging (NeurIPS HITY 2022) — ☆31 · Updated 2 years ago
- ☆34 · Updated 2 years ago
- Unofficial re-implementation of "Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets" — ☆79 · Updated 3 years ago
- ☆45 · Updated 2 years ago
- Code for the NeurIPS 2024 Spotlight "Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations" — ☆85 · Updated last year
- ☆25 · Updated last year
- ☆72 · Updated 11 months ago
- ☆52 · Updated 8 months ago
- ☆110 · Updated 9 months ago
- Code for reproducing the paper "Not All Language Model Features Are Linear" — ☆84 · Updated last year
- Implementation of influence-function approximations for differently sized ML models, using PyTorch — ☆15 · Updated 2 years ago
- ☆53 · Updated last year
- ☆63 · Updated 3 years ago
- Why Do We Need Weight Decay in Modern Deep Learning? (NeurIPS 2024) — ☆68 · Updated last year
- Learning Transformer Programs (NeurIPS 2023) — ☆162 · Updated last year
- ☆20 · Updated 3 weeks ago
- Evaluate interpretability methods on localizing and disentangling concepts in LLMs — ☆56 · Updated 3 weeks ago