hkproj / bert-from-scratch
BERT explained from scratch
☆14Updated last year
Alternatives and similar repositories for bert-from-scratch
Users that are interested in bert-from-scratch are comparing it to the libraries listed below
- Complete implementation of Llama2 with/without KV cache & inference 🚀☆48Updated last year
- LoRA: Low-Rank Adaptation of Large Language Models implemented using PyTorch☆117Updated 2 years ago
- ☆96Updated last year
- Distributed training (multi-node) of a Transformer model☆84Updated last year
- Notes on quantization in neural networks☆104Updated last year
- Project 2 (Building Large Language Models) for Stanford CS324: Understanding and Developing Large Language Models (Winter 2022)☆105Updated 2 years ago
- ☆45Updated 4 months ago
- Building GPT ...☆18Updated 10 months ago
- A set of scripts and notebooks on LLM finetuning and dataset creation☆110Updated last year
- Mixed precision training from scratch with Tensors and CUDA☆28Updated last year
- Unofficial implementation of https://arxiv.org/pdf/2407.14679☆49Updated last year
- Prune transformer layers☆69Updated last year
- Notes on Direct Preference Optimization☆23Updated last year
- Code for studying the super weight in LLM☆120Updated 10 months ago
- An extension of the nanoGPT repository for training small MOE models.☆197Updated 7 months ago
- ☆187Updated last year
- This repository contains an implementation of the LLaMA 2 (Large Language Model Meta AI) model, a Generative Pretrained Transformer (GPT)…☆72Updated 2 years ago
- LLM_library is a comprehensive repository that serves as a one-stop resource for hands-on code and insightful summaries.☆69Updated last year
- ☆86Updated last year
- Notes about LLaMA 2 model☆68Updated 2 years ago
- Code for "LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding", ACL 2024☆343Updated 5 months ago
- Lightweight demos for finetuning LLMs. Powered by 🤗 transformers and open-source datasets.☆78Updated 11 months ago
- Advanced NLP, Spring 2025 https://cmu-l3.github.io/anlp-spring2025/☆66Updated 6 months ago
- LoRA and DoRA from Scratch Implementations☆211Updated last year
- LLM Workshop by Sourab Mangrulkar☆394Updated last year
- ☆209Updated 9 months ago
- NeurIPS Large Language Model Efficiency Challenge: 1 LLM + 1GPU + 1Day☆256Updated last year
- Fine-tune an LLM to perform batch inference and online serving.☆112Updated 4 months ago
- Various installation guides for Large Language Models☆73Updated 5 months ago
- LLaMA 2 implemented from scratch in PyTorch☆355Updated 2 years ago