hkproj/pytorch-transformer

An implementation of "Attention Is All You Need"

☆1,255

Alternatives and similar repositories for pytorch-transformer

Users who are interested in pytorch-transformer are comparing it to the repositories listed below.

hkproj / pytorch-llama
LLaMA 2 implemented from scratch in PyTorch
☆375 · Sep 25, 2023 · Updated 2 years ago
hkproj / pytorch-stable-diffusion
Stable Diffusion implemented from scratch in PyTorch
☆1,073 · Oct 22, 2024 · Updated last year
hkproj / transformer-from-scratch-notes
Notes about "Attention is all you need" video (https://www.youtube.com/watch?v=bCz4OMemCcA)
☆371 · May 28, 2023 · Updated 3 years ago
hkproj / pytorch-paligemma
Coding a Multimodal (Vision) Language Model from scratch in PyTorch with full explanation: https://www.youtube.com/watch?v=vAmKB7iPkWw
☆625 · Dec 6, 2024 · Updated last year
hkproj / pytorch-lora
LORA: Low-Rank Adaptation of Large Language Models implemented using PyTorch
☆128 · Jul 24, 2023 · Updated 3 years ago
hkproj / lazy-ml
ML algorithms implementations that are good for learning the underlying principles
☆28 · Dec 7, 2024 · Updated last year
hkproj / pytorch-transformer-distributed
Distributed training (multi-node) of a Transformer model
☆98 · Apr 10, 2024 · Updated 2 years ago
hkproj / bert-from-scratch
BERT explained from scratch
☆18 · Oct 26, 2023 · Updated 2 years ago
hkproj / rlhf-ppo
Notes and commented code for RLHF (PPO)
☆136 · Feb 27, 2024 · Updated 2 years ago
hkproj / triton-flash-attention
☆257 · Jan 2, 2025 · Updated last year
hyunwoongko / transformer
Transformer: PyTorch Implementation of "Attention Is All You Need"
☆4,626 · Jul 15, 2025 · Updated last year
hkproj / pytorch-llama-notes
Notes about LLaMA 2 model
☆75 · Aug 30, 2023 · Updated 2 years ago
hkproj / mistral-llm-notes
Notes on the Mistral AI model
☆21 · Dec 27, 2023 · Updated 2 years ago
knotgrass / How-Transformers-Work
🧠 A study guide to learn about Transformers
☆12 · Jan 11, 2024 · Updated 2 years ago
hkproj / mistral-src-commented
Reference implementation of Mistral AI 7B v0.1 model.
☆28 · Dec 25, 2023 · Updated 2 years ago
coaxsoft / pytorch_bert
Tutorial for how to build BERT from scratch
☆101 · May 22, 2024 · Updated 2 years ago
hkproj / quantization-notes
Notes on quantization in neural networks
☆131 · Dec 14, 2023 · Updated 2 years ago
hkproj / retrieval-augmented-generation-notes
Slides for "Retrieval Augmented Generation" video
☆27 · Nov 27, 2023 · Updated 2 years ago
karpathy / build-nanogpt
Video+code lecture on building nanoGPT from scratch
☆5,395 · Aug 13, 2024 · Updated last year
karpathy / nanoGPT
The simplest, fastest repository for training/finetuning medium-sized GPTs.
☆61,669 · Nov 12, 2025 · Updated 8 months ago
hkproj / kan-notes
☆19 · May 11, 2024 · Updated 2 years ago
rasbt / LLMs-from-scratch
Implement a ChatGPT-like LLM in PyTorch from scratch, step by step
☆99,963 · Updated this week
wantbook-book / SeRL
SeRL: Self-Play Reinforcement Learning for Large Language Models with Limited Data
☆24 · Jan 24, 2026 · Updated 6 months ago
AliHaiderAhmad001 / BERT-from-Scratch-with-PyTorch
Implementation of BERT-based Language Models
☆28 · Mar 12, 2026 · Updated 4 months ago
karpathy / micrograd
A tiny scalar-valued autograd engine and a neural net library on top of it with PyTorch-like API
☆16,902 · Aug 8, 2024 · Updated last year
mrdbourke / pytorch-deep-learning
Materials for the Learn PyTorch for Deep Learning: Zero to Mastery course.
☆18,572 · Feb 11, 2026 · Updated 5 months ago
tspeterkim / paged-attention-minimal
a minimal cache manager for PagedAttention, on top of llama3.
☆149 · Aug 26, 2024 · Updated last year
explainingai-code / StableDiffusion-PyTorch
This repo implements a Stable Diffusion model in PyTorch with all the essential components.
☆254 · Nov 24, 2024 · Updated last year
dtransposed / code_videos
Code for any videos
☆29 · Feb 17, 2024 · Updated 2 years ago
1y33 / 100Days
GPU Kernels
☆225 · Apr 27, 2025 · Updated last year
karpathy / ng-video-lecture
☆4,884 · Jan 31, 2024 · Updated 2 years ago
Lwasinam / voicera
☆19 · Sep 9, 2024 · Updated last year
Dao-AILab / flash-attention
Fast and memory-efficient exact attention
☆24,568 · Updated this week
BhavyaGoyal777 / IMPLEMENTING-RESEARCH-PAPERS
Basically a repo containing architectures/algorithms/papers from scratch in pytorch
☆30 · Feb 11, 2026 · Updated 5 months ago
mlabonne / llm-course
Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.
☆81,303 · Feb 5, 2026 · Updated 5 months ago
kjsman / stable-diffusion-pytorch
Yet another PyTorch implementation of Stable Diffusion (probably easy to read)
☆593 · Mar 4, 2024 · Updated 2 years ago
HandsOnLLM / Hands-On-Large-Language-Models
Official code repo for the O'Reilly Book - "Hands-On Large Language Models"
☆27,866 · Apr 24, 2026 · Updated 3 months ago
karpathy / LLM101n
LLM101n: Let's build a Storyteller
☆37,507 · Aug 1, 2024 · Updated last year
McGill-NLP / nano-aha-moment
Single File, Single GPU, From Scratch, Efficient, Full Parameter Tuning library for "RL for LLMs"
☆625 · Oct 7, 2025 · Updated 9 months ago