saqib1707 / gpt2-from-scratch
PyTorch Implementation of GPT-2
☆10Updated 9 months ago
Alternatives and similar repositories for gpt2-from-scratch
Users who are interested in gpt2-from-scratch are comparing it to the repositories listed below
- Recover Wi-Fi Password Using CMD, Windows PowerShell☆12Updated last year
- ☆54Updated last week
- Quantization of LLMs and benchmarking.☆10Updated last year
- LLM training in simple C++/CUDA (with Eigen3)☆14Updated 9 months ago
- Triton implementation of GPT/LLAMA☆18Updated 9 months ago
- LoRA: Low-Rank Adaptation of Large Language Models implemented using PyTorch☆105Updated last year
- LLaMA 2 implemented from scratch in PyTorch☆329Updated last year
- ☆31Updated 11 months ago
- A repository to unravel the language of GPUs, making their kernel conversations easy to understand☆185Updated this week
- ☆36Updated 2 weeks ago
- Complete implementation of Llama2 with/without KV cache & inference 🚀☆46Updated last year
- Minimal (truly) muP implementation, consistent with TP4 and TP5 papers notation☆14Updated 2 weeks ago
- LoRA and DoRA from Scratch Implementations☆204Updated last year
- A repository consisting of paper/architecture replications of classic/SOTA AI/ML papers in PyTorch☆196Updated last month
- Personal GitHub profile showcasing AI, machine learning, and software development expertise.☆10Updated 3 weeks ago
- making the official triton tutorials actually comprehensible☆36Updated 2 months ago
- Tutorial for how to build BERT from scratch☆93Updated last year
- ☆169Updated 5 months ago
- MIRA - Multimodal Image Reconstruction with Attention is a transformer (Encoder-Decoder) based architecture for Text / Image to 3D recons…☆13Updated last year
- documentation for content creation☆199Updated this week
- From scratch implementation of a vision language model in pure PyTorch☆220Updated last year
- Simple byte pair encoding mechanism used for tokenization, written purely in C☆132Updated 6 months ago
- Tutorials for Triton, a language for writing gpu kernels☆18Updated last year
- a simplified version of Google's Gemma model to be used for learning☆25Updated last year
- ☆14Updated last month
- PyTorch/XLA SPMD test code on Google TPU☆23Updated last year
- Code for "LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding", ACL 2024☆301Updated last month
- Experimenting with small language models☆67Updated last year
- ☆178Updated 5 months ago
- Combining ViT and GPT-2 for image captioning. Trained on MS-COCO. The model was implemented mostly from scratch.☆42Updated last year