RandallBalestriero / SplineLLM

☆16

Alternatives and similar repositories for SplineLLM

Users who are interested in SplineLLM are comparing it to the libraries listed below.

taufeeque9 / codebook-features
Sparse and discrete interpretability tool for neural networks
☆64 Updated 2 years ago
ethancaballero / broken_neural_scaling_laws
Code Release for "Broken Neural Scaling Laws" (BNSL) paper
☆59 Updated 2 years ago
EleutherAI / rnngineering
Engineering the state of RNN language models (Mamba, RWKV, etc.)
☆32 Updated last year
EleutherAI / features-across-time
Understanding how features learned by neural networks evolve throughout training
☆41 Updated last year
Ping-C / optimizer
This repository includes code to reproduce the tables in "Loss Landscapes are All You Need: Neural Network Generalization Can Be Explaine…
☆40 Updated 2 years ago
shauli-ravfogel / rlace-icml
☆36 Updated 3 years ago
brendel-group / compositional-ood-generalization
Official code for the paper "Compositional Generalization from First Principles" (NeurIPS 2023)
☆13 Updated 2 years ago
bhoov / energy-transformer-jax
The Energy Transformer block, in JAX
☆63 Updated 2 years ago
abhishekpanigrahi1996 / transformer_in_transformer
☆46 Updated 2 years ago
keyonvafa / world-model-evaluation
☆77 Updated last year
ml-jku / quam
Quantification of Uncertainty with Adversarial Models
☆29 Updated 2 years ago
epfl-dlab / understanding-decoding
The data and the PyTorch implementation for the models and experiments in the paper "Language Model Decoding as Likelihood–Utility Alignm…
☆14 Updated 2 years ago
lucidrains / esbn-transformer
An attempt to merge ESBN with Transformers, to endow Transformers with the ability to emergently bind symbols
☆16 Updated 4 years ago
ekinakyurek / google-research
Google Research
☆46 Updated 3 years ago
aks2203 / deep-thinking
A centralized place for deep thinking code and experiments
☆90 Updated 2 years ago
EleutherAI / mdl
Minimum Description Length probing for neural network representations
☆20 Updated last year
jxbz / entropix
📰 Computing the information content of trained neural networks
☆22 Updated 4 years ago
feradauto / MoralCoT
Repo for: When to Make Exceptions: Exploring Language Models as Accounts of Human Moral Judgment
☆38 Updated 2 years ago
KihoPark / LLM_Categorical_Hierarchical_Representations
☆112 Updated last year
ema-marconato / glancenet
Updated code base for GlanceNets: Interpretable, Leak-proof Concept-based models
☆25 Updated 2 years ago
codekansas / rwkv
RWKV model implementation
☆37 Updated 2 years ago
JeanKaddour / NoTrainNoGain
Revisiting Efficient Training Algorithms For Transformer-based Language Models (NeurIPS 2023)
☆81 Updated 2 years ago
dylandoblar / noether-networks
Meta-learning inductive biases in the form of useful conserved quantities.
☆39 Updated 3 years ago
shoaibahmed / metadata_archaeology
Official code for the paper: "Metadata Archaeology"
☆19 Updated 2 years ago
etimush / ARC_NCA
Repo for solving ARC problems with Neural Cellular Automata
☆23 Updated 8 months ago
guy-dar / embedding-space
☆57 Updated 2 years ago
bilal-chughtai / rep-theory-mech-interp
☆28 Updated 2 years ago
dangxingyu / rnn-icrag
Official repository of paper "RNNs Are Not Transformers (Yet): The Key Bottleneck on In-context Retrieval"
☆27 Updated last year
jxiw / BiGS
Official repository of Pretraining Without Attention (BiGS). BiGS is the first model to achieve BERT-level transfer learning on the GLUE …
☆116 Updated last year
SamsungSAILMontreal / PAPA
Repository for the PopulAtion Parameter Averaging (PAPA) paper
☆30 Updated last year