jsbaan / transformer-from-scratch
Well documented, unit tested, type checked and formatted implementation of a vanilla transformer - for educational purposes.
☆273Updated last year
Alternatives and similar repositories for transformer-from-scratch
Users who are interested in transformer-from-scratch are comparing it to the repositories listed below.
- LORA: Low-Rank Adaptation of Large Language Models implemented using PyTorch☆119Updated 2 years ago
- Llama from scratch, or How to implement a paper without crying☆583Updated last year
- LLaMA 2 implemented from scratch in PyTorch☆365Updated 2 years ago
- A Simplified PyTorch Implementation of Vision Transformer (ViT)☆231Updated last year
- I will build Transformer from scratch☆85Updated 5 months ago
- Attention Is All You Need | a PyTorch Tutorial to Transformers☆362Updated last year
- Tutorial for how to build BERT from scratch☆101Updated last year
- Notes about "Attention is all you need" video (https://www.youtube.com/watch?v=bCz4OMemCcA)☆332Updated 2 years ago
- Code implementation from my blog post: https://fkodom.substack.com/p/transformers-from-scratch-in-pytorch☆97Updated 2 years ago
- Annotated version of the Mamba paper☆494Updated last year
- MinT: Minimal Transformer Library and Tutorials☆260Updated 3 years ago
- Annotations of the interesting ML papers I read☆271Updated 2 weeks ago
- LoRA and DoRA from Scratch Implementations☆215Updated last year
- An extension of the nanoGPT repository for training small MOE models.☆225Updated 10 months ago
- Distributed training (multi-node) of a Transformer model☆91Updated last year
- Code Transformer neural network components piece by piece☆371Updated 2 years ago
- Fast bare-bones BPE for modern tokenizer training☆175Updated 6 months ago
- A set of scripts and notebooks on LLM fine-tuning and dataset creation☆114Updated last year
- ☆189Updated 2 years ago
- Recreating PyTorch from scratch (C/C++, CUDA, NCCL and Python, with multi-GPU support and automatic differentiation!)☆161Updated last month
- Research projects built on top of Transformers☆110Updated 10 months ago
- Training small GPT-2 style models using Kolmogorov-Arnold networks.☆122Updated last year
- Puzzles for exploring transformers☆382Updated 2 years ago
- LLM Workshop by Sourab Mangrulkar☆400Updated last year
- Notes on quantization in neural networks☆114Updated 2 years ago
- Deep learning for dummies. All the practical details and useful utilities that go into working with real models.☆829Updated 5 months ago
- This repository contains an overview of important follow-up works based on the original Vision Transformer (ViT) by Google.☆192Updated 4 years ago
- Best practices & guides on how to write distributed pytorch training code☆564Updated 2 months ago
- For optimization algorithm research and development.☆556Updated this week
- A walkthrough of transformer architecture code☆370Updated last year