tranquoctrinh / transformer
This is a PyTorch implementation of the Transformer model from the paper "Attention Is All You Need".
☆36Updated 10 months ago
Alternatives and similar repositories for transformer
Users who are interested in transformer are comparing it to the repositories listed below.
- LORA: Low-Rank Adaptation of Large Language Models implemented using PyTorch☆122Updated 2 years ago
- A Simplified PyTorch Implementation of Vision Transformer (ViT)☆235Updated last year
- Code Transformer neural network components piece by piece☆372Updated 2 years ago
- Well documented, unit tested, type checked and formatted implementation of a vanilla transformer - for educational purposes.☆282Updated last year
- Attention Is All You Need | a PyTorch Tutorial to Transformers☆362Updated last year
- LLaMA 2 implemented from scratch in PyTorch☆366Updated 2 years ago
- First-principle implementations of groundbreaking AI algorithms using a wide range of deep learning frameworks, accompanied by supporting…☆181Updated 6 months ago
- ☆48Updated 7 months ago
- Attention is all you need implementation☆1,164Updated last year
- Simple, minimal implementation of the Mamba SSM in one PyTorch file. Using logcumsumexp (Heisen sequence).☆130Updated last year
- A numpy implementation of the Transformer model in "Attention is All You Need"☆58Updated last year
- PyTorch Implementation of Jamba: "Jamba: A Hybrid Transformer-Mamba Language Model"☆206Updated 3 weeks ago
- Kolmogorov-Arnold Networks (KAN) using Chebyshev polynomials instead of B-splines.☆402Updated last year
- Implementations of a Mixture-of-Experts (MoE) architecture designed for research on large language models (LLMs) and scalable neural netw…☆55Updated 10 months ago
- An easy to use PyTorch implementation of the Kolmogorov Arnold Network and a few novel variations☆189Updated last year
- Personal short implementations of Machine Learning papers☆252Updated 2 years ago
- This repository contains exhaustive coverage of a hands-on approach to PyTorch alongside powerful tools to accelerate model tuning an…☆226Updated last week
- ☆62Updated last year
- Recreating PyTorch from scratch (C/C++, CUDA, NCCL and Python, with multi-GPU support and automatic differentiation!)☆162Updated 2 months ago
- Variations of Kolmogorov-Arnold Networks☆116Updated last year
- ☆64Updated 3 years ago
- ☆13Updated last year
- Distributed training (multi-node) of a Transformer model☆93Updated last year
- Understanding Kolmogorov-Arnold Networks: A Tutorial Series on KAN using Toy Examples☆204Updated 8 months ago
- Mixture of Experts from scratch☆12Updated last year
- Code and written solutions of the assignments of the Stanford CS224N: Natural Language Processing with Deep Learning course from winter 2…☆269Updated last year
- A simplified version of Meta's Llama 3 model to be used for learning☆44Updated last year
- Natural Language Processing Courses with Resources☆42Updated 4 months ago
- 🦍 Stanford CS236: Deep Generative Models☆158Updated 7 years ago
- Notes on quantization in neural networks☆117Updated 2 years ago