jsbaan / transformer-from-scratchLinks

Well documented, unit tested, type checked and formatted implementation of a vanilla transformer - for educational purposes.

☆254

Alternatives and similar repositories for transformer-from-scratch

Users that are interested in transformer-from-scratch are comparing it to the libraries listed below

Sorting:

bkitano / llama-from-scratch
Llama from scratch, or How to implement a paper without crying
☆574Updated last year
tintn / vision-transformer-from-scratch
A Simplified PyTorch Implementation of Vision Transformer (ViT)
☆196Updated last year
Khaliladib11 / Transformer-from-scratch
I will build Transformer from scratch
☆78Updated 2 weeks ago
srush / annotated-mamba
Annotated version of the Mamba paper
☆487Updated last year
sgrvinod / a-PyTorch-Tutorial-to-Transformers
Attention Is All You Need | a PyTorch Tutorial to Transformers
☆332Updated last year
coaxsoft / pytorch_bert
Tutorial for how to build BERT from scratch
☆97Updated last year
hkproj / pytorch-llama
LLaMA 2 implemented from scratch in PyTorch
☆345Updated last year
hkproj / pytorch-lora
LORA: Low-Rank Adaptation of Large Language Models implemented using PyTorch
☆112Updated 2 years ago
shreyansh26 / Annotated-ML-Papers
Annotations of the interesting ML papers I read
☆245Updated last week
lucasdelimanogueira / PyNorch
Recreating PyTorch from scratch (C/C++, CUDA, NCCL and Python, with multi-GPU support and automatic differentiation!)
☆151Updated last year
EurekaLabsAI / tensor
The Tensor (or Array)
☆441Updated 11 months ago
hkproj / transformer-from-scratch-notes
Notes about "Attention is all you need" video (https://www.youtube.com/watch?v=bCz4OMemCcA)
☆299Updated 2 years ago
gautierdag / bpeasy
Fast bare-bones BPE for modern tokenizer training
☆164Updated last month
EurekaLabsAI / mlp
The Multilayer Perceptron Language Model
☆557Updated 11 months ago
llm-efficiency-challenge / neurips_llm_efficiency_challenge
NeurIPS Large Language Model Efficiency Challenge: 1 LLM + 1GPU + 1Day
☆256Updated last year
neubig / anlp-code
☆181Updated last year
ajhalthor / Transformer-Neural-Network
Code Transformer neural network components piece by piece
☆358Updated 2 years ago
tcapelle / llm_recipes
A set of scripts and notebooks on LLM finetunning and dataset creation
☆110Updated 10 months ago
eduardoleao052 / Autograd-from-scratch
Documented and Unit Tested educational Deep Learning framework with Autograd from scratch.
☆117Updated last year
srush / Transformer-Puzzles
Puzzles for exploring transformers
☆356Updated 2 years ago
ayulockin / neurips-llm-efficiency-challenge
Starter pack for NeurIPS LLM Efficiency Challenge 2023.
☆125Updated last year
dpressel / mint
MinT: Minimal Transformer Library and Tutorials
☆256Updated 3 years ago
fkodom / transformer-from-scratch
Code implementation from my blog post: https://fkodom.substack.com/p/transformers-from-scratch-in-pytorch
☆95Updated 2 years ago
neubig / minllama-assignment
☆90Updated 10 months ago
pacman100 / LLM-Workshop
LLM Workshop by Sourab Mangrulkar
☆388Updated last year
facebookresearch / optimizers
For optimization algorithm research and development.
☆524Updated this week
CG80499 / KAN-GPT-2
Training small GPT-2 style models using Kolmogorov-Arnold networks.
☆121Updated last year
wolfecameron / nanoMoE
An extension of the nanoGPT repository for training small MOE models.
☆164Updated 4 months ago
EleutherAI / cookbook
Deep learning for dummies. All the practical details and useful utilities that go into working with real models.
☆809Updated last week
hkproj / pytorch-transformer-distributed
Distributed training (multi-node) of a Transformer model
☆76Updated last year