coaxsoft / pytorch_bert
Tutorial for how to build BERT from scratch
☆86 · Updated 7 months ago
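For orientation: "building BERT from scratch" in tutorials like this one usually comes down to stacking Transformer encoder layers. Below is a minimal sketch of one such layer in PyTorch, assuming BERT-base hyperparameters (hidden size 768, 12 heads, post-LayerNorm); the class name `BertEncoderLayer` and the defaults are illustrative, not code from this repository.

```python
# A minimal BERT-style encoder layer -- illustrative sketch only,
# not taken from coaxsoft/pytorch_bert.
import torch
import torch.nn as nn

class BertEncoderLayer(nn.Module):
    def __init__(self, hidden=768, heads=12, ff=3072, dropout=0.1):
        super().__init__()
        self.attn = nn.MultiheadAttention(hidden, heads, dropout=dropout, batch_first=True)
        self.ff = nn.Sequential(nn.Linear(hidden, ff), nn.GELU(), nn.Linear(ff, hidden))
        self.norm1, self.norm2 = nn.LayerNorm(hidden), nn.LayerNorm(hidden)
        self.drop = nn.Dropout(dropout)

    def forward(self, x, pad_mask=None):
        # Post-LN residual blocks, as in the original BERT
        attn_out, _ = self.attn(x, x, x, key_padding_mask=pad_mask)
        x = self.norm1(x + self.drop(attn_out))
        return self.norm2(x + self.drop(self.ff(x)))

x = torch.randn(2, 16, 768)          # (batch, seq_len, hidden)
print(BertEncoderLayer()(x).shape)   # torch.Size([2, 16, 768])
```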
Alternatives and similar repositories for pytorch_bert:
Users interested in pytorch_bert are comparing it to the libraries listed below.
- LLaMA 2 implemented from scratch in PyTorch ☆280 · Updated last year
- Prune transformer layers ☆67 · Updated 7 months ago
- LoRA: Low-Rank Adaptation of Large Language Models, implemented using PyTorch ☆92 · Updated last year
- An implementation of the LLaMA 2 (Large Language Model Meta AI) model, a Generative Pretrained Transformer (GPT)… ☆56 · Updated last year
- Some notebooks for NLP ☆189 · Updated last year
- A well-documented, unit-tested, type-checked, and formatted implementation of a vanilla transformer, for educational purposes ☆233 · Updated 9 months ago
- A Simplified PyTorch Implementation of Vision Transformer (ViT) ☆151 · Updated 7 months ago
- Complete implementation of Llama 2 with/without KV cache & inference 🚀 ☆47 · Updated 7 months ago
- Exploring finetuning public checkpoints on filtered 8K sequences on the Pile ☆115 · Updated last year
- Early solution for the Google AI4Code competition ☆76 · Updated 2 years ago
- LoRA and DoRA from Scratch Implementations (see the LoRA sketch after this list) ☆194 · Updated 10 months ago
- Efficient Attention for Long Sequence Processing ☆91 · Updated last year
- Lightweight demos for finetuning LLMs. Powered by 🤗 transformers and open-source datasets. ☆66 · Updated 2 months ago
- A simplified version of Meta's Llama 3 model, intended for learning ☆38 · Updated 7 months ago
- An open collection of implementation tips, tricks and resources for training large language models ☆467 · Updated last year
- Distributed training (multi-node) of a Transformer model ☆49 · Updated 9 months ago
- A minimal example of aligning language models with RLHF, similar to ChatGPT ☆215 · Updated last year
- Positional Skip-wise Training for Efficient Context Window Extension of LLMs to Extreme Lengths (ICLR 2024) ☆204 · Updated 7 months ago
- Code used for the "Optimizing Memory Usage for Training LLMs and Vision Transformers in PyTorch" blog post ☆87 · Updated last year
- I will build a Transformer from scratch ☆52 · Updated 8 months ago
- A set of scripts and notebooks on LLM finetuning and dataset creation ☆99 · Updated 3 months ago
- Recurrent Memory Transformer ☆148 · Updated last year
- Official code for ReLoRA, from the paper "Stack More Layers Differently: High-Rank Training Through Low-Rank Updates" ☆439 · Updated 8 months ago
- Implementation of CALM from the paper "LLM Augmented LLMs: Expanding Capabilities through Composition", out of Google DeepMind ☆172 · Updated 4 months ago
- 🧠 A study guide to learn about Transformers ☆10 · Updated last year
- Implementation of Reinforcement Learning from Human Feedback (RLHF) ☆171 · Updated last year
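Several entries above (the LoRA, DoRA, and ReLoRA repositories) implement low-rank adaptation. As a reference point, here is a minimal sketch of the core LoRA idea, assuming the standard formulation from the paper: a frozen base weight plus a trainable low-rank update B·A scaled by alpha/r. The class name `LoRALinear` is illustrative, not taken from any repository listed here.

```python
# Minimal LoRA linear layer -- frozen base weight plus a trainable
# low-rank update (B @ A), scaled by alpha / r. Illustrative sketch.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, in_features, out_features, r=8, alpha=16):
        super().__init__()
        self.base = nn.Linear(in_features, out_features)
        for p in self.base.parameters():   # freeze the pretrained layer
            p.requires_grad_(False)
        self.A = nn.Parameter(torch.randn(r, in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(out_features, r))  # zero init: training starts at the base model
        self.scale = alpha / r

    def forward(self, x):
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

layer = LoRALinear(768, 768)
print(layer(torch.randn(2, 768)).shape)  # torch.Size([2, 768])
```

Only `A` and `B` receive gradients, which is what makes LoRA finetuning cheap: optimizer state scales with the rank r rather than with the full weight matrix.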