knotgrass / How-Transformers-Work
🧠 A study guide to learn about Transformers
☆10 · Updated last year
Alternatives and similar repositories for How-Transformers-Work:
Users interested in How-Transformers-Work are comparing it to the repositories listed below.
- Tutorial on how to build BERT from scratch ☆87 · Updated 9 months ago
- Fine-tuning Open-Source LLMs for Adaptive Machine Translation ☆74 · Updated last week
- Lightweight demos for finetuning LLMs. Powered by 🤗 transformers and open-source datasets. ☆67 · Updated 4 months ago
- Prune transformer layers ☆67 · Updated 8 months ago
- LLaMA 2 implemented from scratch in PyTorch ☆294 · Updated last year
- Code example for pretraining an LLM with a vanilla PyTorch training loop ☆11 · Updated 8 months ago
- Okapi: Instruction-tuned Large Language Models in Multiple Languages with Reinforcement Learning from Human Feedback ☆92 · Updated last year
- Distillation Contrastive Decoding: Improving LLMs Reasoning with Contrastive Decoding and Distillation ☆33 · Updated 11 months ago
- EvolKit is an innovative framework designed to automatically enhance the complexity of instructions used for fine-tuning Large Language M… ☆203 · Updated 3 months ago
- Training and fine-tuning an LLM in Python and PyTorch ☆41 · Updated last year
- Fine-tune ModernBERT on a large dataset with custom tokenizer training ☆59 · Updated 2 weeks ago
- Recreating PyTorch from scratch (C/C++, CUDA, NCCL and Python, with multi-GPU support and automatic differentiation!) ☆144 · Updated 8 months ago
- This repository contains an implementation of the LLaMA 2 (Large Language Model Meta AI) model, a Generative Pretrained Transformer (GPT)… ☆61 · Updated last year
- A minimal example of aligning language models with RLHF, similar to ChatGPT ☆217 · Updated last year
- Experiments with inference on LLaMA ☆104 · Updated 8 months ago
- Pre-training code for the Amber 7B LLM ☆162 · Updated 9 months ago
- Implementation of the paper "Data Engineering for Scaling Language Models to 128K Context" ☆451 · Updated 11 months ago
- A set of scripts and notebooks on LLM finetuning and dataset creation ☆103 · Updated 4 months ago
- Complete implementation of Llama2 with/without KV cache & inference 🚀 ☆47 · Updated 8 months ago
- Simple implementation of Speculative Sampling in NumPy for GPT-2 ☆91 · Updated last year
- Notes and commented code for RLHF (PPO) ☆69 · Updated 11 months ago
- Easy and Efficient Quantization for Transformers ☆193 · Updated 2 weeks ago
- Notes about the LLaMA 2 model ☆53 · Updated last year
- This repository contains the code used for my "Optimizing Memory Usage for Training LLMs and Vision Transformers in PyTorch" blog po… ☆87 · Updated last year