shizhediao / awesome-transformers
A curated list of resources dedicated to Transformers.
☆8 · Updated 4 years ago
Alternatives and similar repositories for awesome-transformers
Users interested in awesome-transformers are comparing it to the repositories listed below.
- Experiments and code to generate the GINC small-scale in-context learning dataset from "An Explanation for In-context Learning as Implici…" · ☆106 · Updated last year
- ☆232 · Updated last year
- ☆83 · Updated last year
- ☆166 · Updated last year
- ☆19 · Updated 2 months ago
- nanoGPT-like codebase for LLM training · ☆98 · Updated last month
- Code for the paper "The Impact of Positional Encoding on Length Generalization in Transformers" (NeurIPS 2023) · ☆136 · Updated last year
- Omnigrok: Grokking Beyond Algorithmic Data · ☆58 · Updated 2 years ago
- Code to reproduce "Transformers Can Do Arithmetic with the Right Embeddings", McLeish et al. (NeurIPS 2024) · ☆190 · Updated last year
- Skill-It! A Data-Driven Skills Framework for Understanding and Training Language Models · ☆46 · Updated last year
- ☆53 · Updated last year
- [NeurIPS 2023] Learning Transformer Programs · ☆161 · Updated last year
- Neural Networks and the Chomsky Hierarchy · ☆205 · Updated last year
- ☆67 · Updated 2 years ago
- ☆121 · Updated last year
- A library to create and manage configuration files, especially for machine learning projects. · ☆78 · Updated 3 years ago
- ☆35 · Updated 6 months ago
- EMNLP 2020: On the Ability and Limitations of Transformers to Recognize Formal Languages · ☆24 · Updated 4 years ago
- A Kernel-Based View of Language Model Fine-Tuning (https://arxiv.org/abs/2210.05643) · ☆75 · Updated last year
- Evaluate interpretability methods on localizing and disentangling concepts in LLMs. · ☆48 · Updated 8 months ago
- ☆180 · Updated last year
- Framework code with wandb, checkpointing, logging, configs, and experimental protocols. Useful for fine-tuning models or training from scratc… · ☆150 · Updated 2 years ago
- Sequence modeling with Mega. · ☆296 · Updated 2 years ago
- ☆93 · Updated 11 months ago
- Influence Functions with (Eigenvalue-corrected) Kronecker-Factored Approximate Curvature · ☆156 · Updated this week
- NeurIPS Large Language Model Efficiency Challenge: 1 LLM + 1 GPU + 1 Day · ☆257 · Updated last year
- ☆60 · Updated 3 years ago
- Code for the paper "Pretraining task diversity and the emergence of non-Bayesian in-context learning for regression" · ☆21 · Updated 2 years ago
- Revisiting Efficient Training Algorithms for Transformer-based Language Models (NeurIPS 2023) · ☆80 · Updated last year
- Language models scale reliably with over-training and on downstream tasks · ☆97 · Updated last year