hkproj / bert-from-scratch
BERT explained from scratch
☆14Updated last year
Alternatives and similar repositories for bert-from-scratch
Users who are interested in bert-from-scratch are comparing it to the repositories listed below.
- Complete implementation of Llama2 with/without KV cache & inference 🚀☆48Updated last year
- ☆90Updated 10 months ago
- Prune transformer layers☆69Updated last year
- Notes on quantization in neural networks☆92Updated last year
- ☆43Updated 2 months ago
- Distributed training (multi-node) of a Transformer model☆75Updated last year
- Notes on Direct Preference Optimization☆21Updated last year
- LORA: Low-Rank Adaptation of Large Language Models implemented using PyTorch☆112Updated 2 years ago
- A set of scripts and notebooks on LLM fine-tuning and dataset creation☆110Updated 10 months ago
- Mixed precision training from scratch with Tensors and CUDA☆24Updated last year
- This repository contains an implementation of the LLaMA 2 (Large Language Model Meta AI) model, a Generative Pretrained Transformer (GPT)…☆69Updated last year
- Unofficial implementation of https://arxiv.org/pdf/2407.14679☆48Updated 10 months ago
- An extension of the nanoGPT repository for training small MOE models.☆163Updated 4 months ago
- Code for studying the super weight in LLM☆114Updated 7 months ago
- ☆182Updated 6 months ago
- LLaMA 2 implemented from scratch in PyTorch☆343Updated last year
- LLM Workshop by Sourab Mangrulkar☆387Updated last year
- Code for "LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding", ACL 2024☆323Updated 2 months ago
- A repository to unravel the language of GPUs, making their kernel conversations easy to understand☆188Updated 2 months ago
- ☆86Updated last year
- Notes about LLaMA 2 model☆66Updated last year
- ☆203Updated 5 months ago
- Lightweight demos for finetuning LLMs. Powered by 🤗 transformers and open-source datasets.☆77Updated 9 months ago
- making the official triton tutorials actually comprehensible☆53Updated last week
- minimal GRPO implementation from scratch☆94Updated 4 months ago
- ☆181Updated last year
- LoRA and DoRA from Scratch Implementations☆207Updated last year
- LLM_library is a comprehensive repository that serves as a one-stop resource for hands-on code and insightful summaries.☆69Updated last year
- GPU Kernels☆191Updated 3 months ago
- NeurIPS Large Language Model Efficiency Challenge: 1 LLM + 1GPU + 1Day☆256Updated last year