tranquoctrinh / transformer
This is a PyTorch implementation of the Transformer model from the paper "Attention Is All You Need".
☆27 Updated 2 years ago
Alternatives and similar repositories for transformer:
Users who are interested in transformer are comparing it to the libraries listed below:
- A Simplified PyTorch Implementation of Vision Transformer (ViT) ☆154 Updated 7 months ago
- LORA: Low-Rank Adaptation of Large Language Models implemented using PyTorch ☆95 Updated last year
- A modified CNN architecture using Kolmogorov-Arnold Networks ☆70 Updated 8 months ago
- A simplified version of Meta's Llama 3 model to be used for learning ☆38 Updated 8 months ago
- Implementation of the paper "Denoising Diffusion Probabilistic Models" in PyTorch ☆46 Updated last year
- ☆78 Updated 10 months ago
- Distributed training (multi-node) of a Transformer model ☆50 Updated 9 months ago
- Implementation of FlashAttention in PyTorch ☆129 Updated 2 weeks ago
- Natural Language Processing Courses with Resources ☆33 Updated 2 months ago
- Simple, minimal implementation of the Mamba SSM in one PyTorch file. Using logcumsumexp (Heisen sequence). ☆106 Updated 3 months ago
- Well documented, unit tested, type checked and formatted implementation of a vanilla transformer - for educational purposes. ☆233 Updated 9 months ago
- Attention Is All You Need | a PyTorch Tutorial to Transformers ☆285 Updated 11 months ago
- Understanding Kolmogorov-Arnold Networks: A Tutorial Series on KAN using Toy Examples ☆175 Updated 3 months ago
- Notes on the Mamba and the S4 model (Mamba: Linear-Time Sequence Modeling with Selective State Spaces) ☆159 Updated last year
- Minimal Mamba-2 implementation in PyTorch ☆166 Updated 7 months ago
- ☆60 Updated 2 years ago
- LLaMA 2 implemented from scratch in PyTorch ☆286 Updated last year
- Variations of Kolmogorov-Arnold Networks ☆112 Updated 8 months ago
- The best collection of AI tutorials to make you a boss of Data Science! ☆79 Updated last month
- ☆32 Updated 7 months ago
- ☆110 Updated 3 weeks ago
- A More Fair and Comprehensive Comparison between KAN and MLP ☆157 Updated 5 months ago
- This repo implements Denoising Diffusion Probabilistic Models (DDPM) in PyTorch ☆93 Updated 2 months ago
- KAN for Vision Transformer ☆240 Updated 3 months ago
- Benchmarking and Testing FastKAN ☆70 Updated 8 months ago
- Awesome UNet with Transformer ☆61 Updated last year
- PyTorch Implementation of Jamba: "Jamba: A Hybrid Transformer-Mamba Language Model" ☆155 Updated this week
- FC-KAN: Function Combinations in Kolmogorov-Arnold Networks ☆26 Updated last month
- Playground for Transformers ☆47 Updated last year