ethanyanjiali / minChatGPT
A minimum example of aligning language models with RLHF similar to ChatGPT
☆217 · Updated last year
Alternatives and similar repositories for minChatGPT:
Users interested in minChatGPT are comparing it to the libraries listed below:
- A (somewhat) minimal library for finetuning language models with PPO on human feedback. ☆85 · Updated 2 years ago
- Multipack distributed sampler for fast padding-free training of LLMs ☆186 · Updated 7 months ago
- Implementation of Reinforcement Learning from Human Feedback (RLHF) ☆172 · Updated last year
- RLHF implementation details of OAI's 2019 codebase ☆184 · Updated last year
- ☆96 · Updated last year
- Implementation of Recurrent Memory Transformer, NeurIPS 2022 paper, in PyTorch ☆407 · Updated 2 months ago
- Official repository of NEFTune: Noisy Embeddings Improve Instruction Finetuning ☆393 · Updated 10 months ago
- This is the repo for the paper Shepherd -- A Critic for Language Model Generation ☆218 · Updated last year
- Simple next-token-prediction for RLHF ☆222 · Updated last year
- A crude RLHF layer on top of nanoGPT with the Gumbel-Softmax trick ☆289 · Updated last year
- DSIR large-scale data selection framework for language model training ☆244 · Updated 11 months ago
- Code accompanying the paper Pretraining Language Models with Human Preferences ☆180 · Updated last year
- Due to the license restrictions of LLaMA, we try to reimplement BLOOM-LoRA (the much less restrictive BLOOM license is here: https://huggingface.co/spaces/bigs…) ☆184 · Updated last year
- Code for fine-tuning the Platypus family of LLMs using LoRA ☆628 · Updated last year
- A fine-tuned LLaMA that is good at arithmetic tasks ☆177 · Updated last year
- Batched LoRAs ☆340 · Updated last year
- Scripts for fine-tuning Llama2 via SFT and DPO ☆195 · Updated last year
- Implementation of the conditionally routed attention in the CoLT5 architecture, in PyTorch ☆226 · Updated 6 months ago
- Scaling Data-Constrained Language Models ☆335 · Updated 6 months ago
- Open Instruction Generalist is an assistant trained on massive synthetic instructions to perform many millions of tasks ☆208 · Updated last year
- Code and data for "MAmmoTH: Building Math Generalist Models through Hybrid Instruction Tuning" (ICLR 2024) ☆364 · Updated 7 months ago
- Exploring finetuning public checkpoints on filtered 8K-token sequences from the Pile ☆115 · Updated 2 years ago
- Implementation of the paper Data Engineering for Scaling Language Models to 128K Context ☆454 · Updated last year
- Rectified Rotary Position Embeddings ☆361 · Updated 10 months ago
- Official PyTorch implementation of QA-LoRA ☆129 · Updated last year
- Recurrent Memory Transformer ☆149 · Updated last year
- Code used for sourcing and cleaning the BigScience ROOTS corpus ☆309 · Updated 2 years ago
- All available datasets for Instruction Tuning of Large Language Models ☆247 · Updated last year
- A simulation framework for RLHF and alternatives. Develop your RLHF method without collecting human data. ☆801 · Updated 9 months ago
- ☆136 · Updated 4 months ago
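Many of the repositories above center on RLHF-style preference optimization. Their shared first step is training a reward model on human preference pairs with a Bradley-Terry (pairwise) loss. A minimal plain-Python sketch of that objective follows; the function name and scores are illustrative and not taken from any listed repo:

```python
import math

def reward_model_loss(score_chosen: float, score_rejected: float) -> float:
    """Bradley-Terry pairwise loss: -log sigmoid(r_chosen - r_rejected).

    Smaller when the reward model scores the human-preferred (chosen)
    response higher than the rejected one.
    """
    margin = score_chosen - score_rejected
    # -log(sigmoid(x)) rewritten as log(1 + exp(-x)) for numerical stability
    return math.log1p(math.exp(-margin))

# A correctly ordered pair (chosen scored higher) yields a smaller loss
# than a mis-ordered one.
good = reward_model_loss(2.0, 0.5)   # margin +1.5
bad = reward_model_loss(0.5, 2.0)    # margin -1.5
print(good < bad)  # True
```

In the PPO-based repos listed here, this trained reward model then supplies the scalar reward that the policy (the language model) is optimized against, typically with a KL penalty toward the supervised model.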