goldblum / free-lunch
Implementation of experiments from "The No Free Lunch Theorem, Kolmogorov Complexity, and the Role of Inductive Biases in Machine Learning"
☆17 · Updated 2 years ago
Alternatives and similar repositories for free-lunch
Users interested in free-lunch are comparing it to the libraries listed below.
- Official repository for the paper "Transformers Learn Higher-Order Optimization Methods for In-Context Learning: A Study with Linear Mode…" — ☆19 · Updated last year
- ☆53 · Updated last year
- Gemstones: A Model Suite for Multi-Faceted Scaling Laws (NeurIPS 2025) — ☆29 · Updated 2 months ago
- ☆24 · Updated 7 months ago
- nanoGPT-like codebase for LLM training — ☆110 · Updated 3 weeks ago
- Universal Neurons in GPT2 Language Models — ☆31 · Updated last year
- Omnigrok: Grokking Beyond Algorithmic Data — ☆62 · Updated 2 years ago
- ☆31 · Updated 8 months ago
- Code to reproduce the tables in "Loss Landscapes are All You Need: Neural Network Generalization Can Be Explaine…" — ☆40 · Updated 2 years ago
- Sparse and discrete interpretability tool for neural networks — ☆64 · Updated last year
- Revisiting Efficient Training Algorithms for Transformer-based Language Models (NeurIPS 2023) — ☆81 · Updated 2 years ago
- Sparse Autoencoder Training Library — ☆55 · Updated 6 months ago
- Code associated with papers on superposition (in ML interpretability) — ☆33 · Updated 3 years ago
- Latest Weight Averaging (NeurIPS HITY 2022) — ☆31 · Updated 2 years ago
- ☆34 · Updated 2 years ago
- Unofficial re-implementation of "Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets" — ☆79 · Updated 3 years ago
- ☆45 · Updated 2 years ago
- Code for the NeurIPS 2024 Spotlight "Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations" — ☆85 · Updated last year
- ☆25 · Updated last year
- ☆72 · Updated 11 months ago
- ☆52 · Updated 8 months ago
- ☆110 · Updated 9 months ago
- Code for reproducing the paper "Not All Language Model Features Are Linear" — ☆84 · Updated last year
- Implementation of influence-function approximations for differently sized ML models, using PyTorch — ☆15 · Updated 2 years ago
- ☆53 · Updated last year
- ☆63 · Updated 3 years ago
- Why Do We Need Weight Decay in Modern Deep Learning? (NeurIPS 2024) — ☆68 · Updated last year
- Learning Transformer Programs (NeurIPS 2023) — ☆162 · Updated last year
- ☆20 · Updated 3 weeks ago
- Evaluate interpretability methods on localizing and disentangling concepts in LLMs — ☆56 · Updated 3 weeks ago