chengxiang / LinearTransformer

PyTorch code for experiments on Linear Transformers

☆23

Alternatives and similar repositories for LinearTransformer

Users who are interested in LinearTransformer compare it to the libraries listed below.

locuslab / edge-of-stability
☆71 Updated 10 months ago
DeqingFu / transformers-icl-second-order
Official repository for our paper, Transformers Learn Higher-Order Optimization Methods for In-Context Learning: A Study with Linear Mode…
☆19 Updated 11 months ago
KindXiaoming / Omnigrok
Omnigrok: Grokking Beyond Algorithmic Data
☆62 Updated 2 years ago
LIONS-EPFL / scion
☆45 Updated last week
gortizji / linearized-networks
Source code of "What can linearized neural networks actually say about generalization?"
☆20 Updated 4 years ago
MaximeRobeyns / bayesian_lora
Bayesian Low-Rank Adaptation for Large Language Models
☆36 Updated last year
xu-ji / information-bottleneck
Deep Learning & Information Bottleneck
☆61 Updated 2 years ago
allenbai01 / transformers-as-statisticians
☆34 Updated 2 years ago
reds-lab / LAVA
Official repository for "LAVA: Data Valuation without Pre-Specified Learning Algorithms" (ICLR 2023).
☆51 Updated last year
dtsip / in-context-learning
☆240 Updated last year
MarlonBecker / MSAM
☆19 Updated last year
gortizji / tangent_task_arithmetic
Source code of "Task arithmetic in the tangent space: Improved editing of pre-trained models".
☆105 Updated 2 years ago
fangyuan-ksgk / selective-attention-transformer
Unofficial Implementation of Selective Attention Transformer
☆17 Updated 11 months ago
cjyaras / deep-lora-transformers
Compressible Dynamics in Deep Overparameterized Low-Rank Learning & Adaptation (ICML'24 Oral)
☆13 Updated last year
WeiHuang05 / Awesome_Large_Foundation_Model_Theory
Welcome to the 'In Context Learning Theory' Reading Group
☆30 Updated 11 months ago
kwignb / NeuralTangentKernel-Papers
Neural Tangent Kernel Papers
☆118 Updated 9 months ago
Leiay / looped_transformer
☆33 Updated last year
wesg52 / universal-neurons
Universal Neurons in GPT2 Language Models
☆30 Updated last year
epfml / llm-baselines
nanoGPT-like codebase for LLM training
☆109 Updated 5 months ago
automl / is_mamba_capable_of_icl
☆18 Updated last year
bartbussmann / matryoshka_sae
☆49 Updated 9 months ago
zyushun / hessian-spectrum
Code for the paper: Why Transformers Need Adam: A Hessian Perspective
☆64 Updated 7 months ago
machine-discovery / deer
Parallelizing non-linear sequential models over the sequence length
☆54 Updated 4 months ago
JeanKaddour / NoTrainNoGain
Revisiting Efficient Training Algorithms For Transformer-based Language Models (NeurIPS 2023)
☆80 Updated 2 years ago
Silent-Zebra / twisted-smc-lm
☆32 Updated 7 months ago
formll / dog
DoG is SGD's Best Friend: A Parameter-Free Dynamic Step Size Schedule
☆63 Updated 2 years ago
jongharyu / neural-svd
Official PyTorch implementation of NeuralSVD (ICML 2024)
☆20 Updated last year
tml-epfl / sharpness-vs-generalization
A modern look at the relationship between sharpness and generalization [ICML 2023]
☆43 Updated 2 years ago
pilancilab / Riemannian_Preconditioned_LoRA
Source code for the paper "Riemannian Preconditioned LoRA for Fine-Tuning Foundation Models"
☆32 Updated last year
KihoPark / linear_rep_geometry
☆108 Updated 8 months ago