knotgrass / How-Transformers-Work
A study guide to learn about Transformers
⭐11, updated last year
Alternatives and similar repositories for How-Transformers-Work
Users interested in How-Transformers-Work are comparing it to the libraries listed below
- Tutorial for how to build BERT from scratch (⭐93, updated last year)
- Recreating PyTorch from scratch (C/C++, CUDA, NCCL and Python, with multi-GPU support and automatic differentiation!) (⭐151, updated last year)
- This repository contains an implementation of the LLaMA 2 (Large Language Model Meta AI) model, a Generative Pretrained Transformer (GPT)… (⭐67, updated last year)
- Training and fine-tuning an LLM in Python and PyTorch (⭐42, updated last year)
- LLaMA 2 implemented from scratch in PyTorch (⭐329, updated last year)
- A set of scripts and notebooks on LLM finetuning and dataset creation (⭐111, updated 8 months ago)
- (⭐87, updated 8 months ago)
- Distributed training (multi-node) of a Transformer model (⭐68, updated last year)
- Micro Llama is a small Llama-based model with 300M parameters trained from scratch on a $500 budget (⭐151, updated last year)
- Notes and commented code for RLHF (PPO) (⭐96, updated last year)
- This repository contains the code for dataset curation and finetuning of the instruct variant of the Bilingual OpenHathi model. The resulting… (⭐23, updated last year)
- An extension of the nanoGPT repository for training small MoE models (⭐147, updated 2 months ago)
- Scripts for fine-tuning Llama2 via SFT and DPO (⭐200, updated last year)
- Minimal GRPO implementation from scratch (⭐90, updated 2 months ago)
- A simplified version of Meta's Llama 3 model to be used for learning (⭐41, updated last year)
- Lightweight demos for finetuning LLMs, powered by 🤗 Transformers and open-source datasets (⭐77, updated 7 months ago)
- Implementation of BERT-based Language Models (⭐19, updated last year)
- LoRA and DoRA from Scratch Implementations (⭐204, updated last year)
- EvolKit is an innovative framework designed to automatically enhance the complexity of instructions used for fine-tuning Large Language Models… (⭐221, updated 7 months ago)
- LLaMA 3 is one of the most promising open-source models after Mistral; this repo recreates its architecture in a simpler manner (⭐166, updated 9 months ago)
- Experiments with inference on Llama (⭐104, updated last year)
- BERT explained from scratch (⭐13, updated last year)
- Prune transformer layers (⭐69, updated last year)
- Okapi: Instruction-tuned Large Language Models in Multiple Languages with Reinforcement Learning from Human Feedback (⭐96, updated last year)
- Pre-training code for the Amber 7B LLM (⭐166, updated last year)
- Starter pack for the NeurIPS LLM Efficiency Challenge 2023 (⭐122, updated last year)
- Building a 2.3M-parameter LLM from scratch with the LLaMA 1 architecture (⭐170, updated last year)
- (⭐169, updated 5 months ago)
- (⭐83, updated last year)
- MultilingualSIFT: Multilingual Supervised Instruction Fine-tuning (⭐89, updated last year)