coaxsoft / pytorch_bert
Tutorial for how to build BERT from scratch
☆90 · Updated 10 months ago
Alternatives and similar repositories for pytorch_bert:
Users interested in pytorch_bert are comparing it to the repositories listed below.
- Well documented, unit tested, type checked, and formatted implementation of a vanilla transformer, for educational purposes. ☆239 · Updated 11 months ago
- A (somewhat) minimal library for finetuning language models with PPO on human feedback. ☆85 · Updated 2 years ago
- ☆80 · Updated last year
- LLaMA 2 implemented from scratch in PyTorch ☆307 · Updated last year
- LoRA: Low-Rank Adaptation of Large Language Models implemented using PyTorch ☆99 · Updated last year
- Scripts for fine-tuning Llama 2 via SFT and DPO. ☆195 · Updated last year
- A set of scripts and notebooks on LLM finetuning and dataset creation ☆105 · Updated 6 months ago
- A Simplified PyTorch Implementation of Vision Transformer (ViT) ☆172 · Updated 9 months ago
- Notes and commented code for RLHF (PPO) ☆77 · Updated last year
- Collection of links, tutorials, and best practices on how to collect the data and build an end-to-end RLHF system to finetune Generative AI m… ☆216 · Updated last year
- ☆17 · Updated 2 months ago
- A minimal example of aligning language models with RLHF, similar to ChatGPT ☆217 · Updated last year
- Fine-tune a T5 transformer model using PyTorch & Transformers 🤗 ☆209 · Updated 4 years ago
- Lightweight demos for finetuning LLMs. Powered by 🤗 transformers and open-source datasets. ☆73 · Updated 5 months ago
- This repository contains an implementation of the LLaMA 2 (Large Language Model Meta AI) model, a Generative Pretrained Transformer (GPT)… ☆63 · Updated last year
- Distributed training (multi-node) of a Transformer model ☆62 · Updated 11 months ago
- A simplified version of Meta's Llama 3 model to be used for learning ☆41 · Updated 10 months ago
- ☆45 · Updated 3 years ago
- BABILong is a benchmark for LLM evaluation using the needle-in-a-haystack approach. ☆193 · Updated this week
- LoRA and DoRA from Scratch Implementations ☆198 · Updated last year
- Code Transformer neural network components piece by piece ☆338 · Updated last year
- Instruct LLMs for flat and nested NER. Fine-tuning Llama and Mistral models for instruction named entity recognition. (Instruction NER) ☆80 · Updated 10 months ago
- A NumPy implementation of the Transformer model in "Attention Is All You Need" ☆54 · Updated 8 months ago
- Playground for Transformers ☆48 · Updated last year
- Efficient Attention for Long Sequence Processing ☆92 · Updated last year
- Prune transformer layers ☆68 · Updated 9 months ago
- ☆96 · Updated last year
- Exploring finetuning public checkpoints on filtered 8K sequences on the Pile ☆115 · Updated 2 years ago
- ☆68 · Updated 2 years ago
- Manage scalable open LLM inference endpoints in Slurm clusters ☆253 · Updated 8 months ago