tranquoctrinh / transformer

This is a PyTorch implementation of the Transformer model from the paper "Attention Is All You Need".

☆27

Alternatives and similar repositories for transformer:

Users who are interested in transformer are comparing it to the repositories listed below.

tintn / vision-transformer-from-scratch
A Simplified PyTorch Implementation of Vision Transformer (ViT)
☆154 Updated 7 months ago
hkproj / pytorch-lora
LoRA: Low-Rank Adaptation of Large Language Models, implemented in PyTorch
☆95 Updated last year
jakariaemon / CNN-KAN
A modified CNN architecture using Kolmogorov-Arnold Networks
☆70 Updated 8 months ago
evintunador / minLlama3
A simplified version of Meta's Llama 3 model, intended for learning
☆38 Updated 8 months ago
hkproj / pytorch-ddpm
Implementation of the paper "Denoising Diffusion Probabilistic Models" in PyTorch
☆46 Updated last year
ChanCheeKean / DataScience
☆78 Updated 10 months ago
hkproj / pytorch-transformer-distributed
Distributed training (multi-node) of a Transformer model
☆50 Updated 9 months ago
shreyansh26 / FlashAttention-PyTorch
Implementation of FlashAttention in PyTorch
☆129 Updated 2 weeks ago
atakansite / nlp-courses
Natural Language Processing Courses with Resources
☆33 Updated 2 months ago
PeaBrane / mamba-tiny
Simple, minimal implementation of the Mamba SSM in one PyTorch file, using logcumsumexp (Heisen sequence).
☆106 Updated 3 months ago
jsbaan / transformer-from-scratch
Well-documented, unit-tested, type-checked, and formatted implementation of a vanilla transformer, for educational purposes.
☆233 Updated 9 months ago
sgrvinod / a-PyTorch-Tutorial-to-Transformers
Attention Is All You Need | a PyTorch Tutorial to Transformers
☆285 Updated 11 months ago
pg2455 / KAN-Tutorial
Understanding Kolmogorov-Arnold Networks: A Tutorial Series on KAN using Toy Examples
☆175 Updated 3 months ago
hkproj / mamba-notes
Notes on the Mamba and the S4 model (Mamba: Linear-Time Sequence Modeling with Selective State Spaces)
☆159 Updated last year
tommyip / mamba2-minimal
Minimal Mamba-2 implementation in PyTorch
☆166 Updated 7 months ago
maciejbalawejder / Deep-Learning-Collection
☆60 Updated 2 years ago
hkproj / pytorch-llama
LLaMA 2 implemented from scratch in PyTorch
☆286 Updated last year
Indoxer / LKAN
Variations of Kolmogorov-Arnold Networks
☆112 Updated 8 months ago
FrancoisPorcher / awesome-ai-tutorials
The best collection of AI tutorials to make you a boss of Data Science!
☆79 Updated last month
angry-kratos / Simple_Llama3_from_scratch
☆32 Updated 7 months ago
hkproj / triton-flash-attention
☆110 Updated 3 weeks ago
yu-rp / KANbeFair
A Fairer and More Comprehensive Comparison between KAN and MLP
☆157 Updated 5 months ago
explainingai-code / DDPM-Pytorch
This repo implements Denoising Diffusion Probabilistic Models (DDPM) in PyTorch
☆93 Updated 2 months ago
chenziwenhaoshuai / Vision-KAN
KAN for Vision Transformer
☆240 Updated 3 months ago
AthanasiosDelis / faster-kan
Benchmarking and Testing FastKAN
☆70 Updated 8 months ago
givkashi / Awesome-unet-like-transformers
Awesome UNet with Transformer
☆61 Updated last year
kyegomez / Jamba
PyTorch Implementation of Jamba: "Jamba: A Hybrid Transformer-Mamba Language Model"
☆155 Updated this week
hoangthangta / FC_KAN
FC-KAN: Function Combinations in Kolmogorov-Arnold Networks
☆26 Updated last month
Montinger / Transformer-Workbench
Playground for Transformers
☆47 Updated last year