knotgrass / How-Transformers-Work
🧠 A study guide to learn about Transformers
☆12, updated last year
Alternatives and similar repositories for How-Transformers-Work
Users interested in How-Transformers-Work are comparing it to the repositories listed below.
- Tutorial on how to build BERT from scratch (☆100, updated last year)
- LLM Workshop by Sourab Mangrulkar (☆398, updated last year)
- Well-documented, unit-tested, type-checked, and formatted implementation of a vanilla transformer, for educational purposes (☆274, updated last year)
- Recreating PyTorch from scratch (C/C++, CUDA, NCCL and Python, with multi-GPU support and automatic differentiation!) (☆161, updated last month)
- LLaMA 2 implemented from scratch in PyTorch (☆363, updated 2 years ago)
- An extension of the nanoGPT repository for training small MoE models (☆219, updated 9 months ago)
- Distributed training (multi-node) of a Transformer model (☆90, updated last year)
- A set of scripts and notebooks on LLM fine-tuning and dataset creation (☆112, updated last year)
- GPU Kernels (☆212, updated 8 months ago)
- A repository to unravel the language of GPUs, making their kernel conversations easy to understand (☆195, updated 6 months ago)
- Llama from scratch, or How to implement a paper without crying (☆581, updated last year)
- Notes about the LLaMA 2 model (☆71, updated 2 years ago)
- This repository contains the code for dataset curation and fine-tuning of the instruct variant of the Bilingual OpenHathi model. The resultin… (☆23, updated 2 years ago)
- Fine-tuning Open-Source LLMs for Adaptive Machine Translation (☆90, updated 5 months ago)
- Notes and commented code for RLHF (PPO) (☆120, updated last year)
- LoRA and DoRA from Scratch Implementations (☆214, updated last year)
- This repository contains an implementation of the LLaMA 2 (Large Language Model Meta AI) model, a Generative Pretrained Transformer (GPT)… (☆74, updated 2 years ago)
- Best practices for distilling large language models (☆595, updated last year)
- A curated list of resources for learning and exploring Triton, OpenAI's programming language for writing efficient GPU code (☆444, updated 9 months ago)
- Reference implementation of the Mistral AI 7B v0.1 model (☆28, updated 2 years ago)
- EvolKit is an innovative framework designed to automatically enhance the complexity of instructions used for fine-tuning Large Language M… (☆245, updated last year)
- A simplified version of Meta's Llama 3 model to be used for learning (☆43, updated last year)
- Annotations of the interesting ML papers I read (☆269, updated 2 months ago)
- A collection of LogitsProcessors to customize and enhance LLM behavior for specific tasks (☆377, updated 5 months ago)