yizhangzzz / transformers-lego

☆18

Alternatives and similar repositories for transformers-lego

Users who are interested in transformers-lego are comparing it to the libraries listed below.

ethancaballero / broken_neural_scaling_laws
Code Release for "Broken Neural Scaling Laws" (BNSL) paper
☆59 Updated last year
remilepriol / causal-adaptation-speed
Investigate the speed of adaptation of structural causal models
☆15 Updated 4 years ago
ekinakyurek / google-research
Google Research
☆46 Updated 2 years ago
ethz-spylab / superhuman-ai-consistency
☆29 Updated 2 years ago
EleutherAI / polyapprox
Closed-form polynomial approximations to neural networks
☆13 Updated 5 months ago
ssokota / mec
Code for minimum-entropy coupling.
☆32 Updated last year
EleutherAI / mdl
Minimum Description Length probing for neural network representations
☆18 Updated 5 months ago
IBM / ColPret
Efficient Scaling laws and collaborative pretraining.
☆16 Updated 5 months ago
rovle / gpt3-in-context-fitting
Experiments on GPT-3's ability to fit numerical models in-context.
☆14 Updated 2 years ago
brendel-group / compositional-ood-generalization
Official code for the paper "Compositional Generalization from First Principles" (NeurIPS 2023)
☆11 Updated last year
UKPLab / on-emergence
Code and files for the paper "Are Emergent Abilities in Large Language Models just In-Context Learning?"
☆33 Updated 6 months ago
jmtomczak / git_flow
General Invertible Transformations for Flow-based Generative Models
☆18 Updated 4 years ago
AndyShih12 / LongHorizonTemperatureScaling
PyTorch implementation for "Long Horizon Temperature Scaling", ICML 2023
☆20 Updated 2 years ago
rahul13ramesh / compositional_capabilities
Compositional Capabilities of Autoregressive Transformers: A Study on Synthetic, Interpretable Tasks
☆10 Updated last year
konstantinosKokos / ape
🧮 Algebraic Positional Encodings.
☆16 Updated 6 months ago
aks2203 / easy-to-hard
Official repository for the paper "Can You Learn an Algorithm? Generalizing from Easy to Hard Problems with Recurrent Networks"
☆59 Updated 3 years ago
EleutherAI / features-across-time
Understanding how features learned by neural networks evolve throughout training
☆36 Updated 8 months ago
stanfordmlgroup / disentanglement
Official repository for our ICLR 2021 paper "Evaluating the Disentanglement of Deep Generative Models with Manifold Topology"
☆36 Updated 4 years ago
stanislavfort / dissect-git-re-basin
Replicating and dissecting the git-re-basin project in one-click-replication Colabs
☆36 Updated 2 years ago
lxuechen / ml-swissknife
An ML research codebase built with friends :)
☆24 Updated 10 months ago
SuReLI / NeurOps
Implementations of growing and pruning in neural networks
☆22 Updated last year
borjanG / 2023-transformers
Code for the paper "The emergence of clusters in self-attention dynamics".
☆16 Updated last year
taufeeque9 / codebook-features
Sparse and discrete interpretability tool for neural networks
☆63 Updated last year
shoaibahmed / metadata_archaeology
Official code for the paper: "Metadata Archaeology"
☆19 Updated 2 years ago
SamsungSAILMontreal / nino
Code for "Accelerating Training with Neuron Interaction and Nowcasting Networks" [to appear at ICLR 2025]
☆19 Updated last month
EleutherAI / rnngineering
Engineering the state of RNN language models (Mamba, RWKV, etc.)
☆32 Updated last year
JeanKaddour / LAWA
Latest Weight Averaging (NeurIPS HITY 2022)
☆30 Updated 2 years ago
dylandoblar / noether-networks
Meta-learning inductive biases in the form of useful conserved quantities.
☆37 Updated 2 years ago
ethansmith2000 / MazeSolver
☆11 Updated last year
LCS2-IIITD / TransEvolve
☆11 Updated 3 years ago