ethanyanjiali / minChatGPT
A minimum example of aligning language models with RLHF similar to ChatGPT
☆217 · Updated last year
Alternatives and similar repositories for minChatGPT:
Users interested in minChatGPT are comparing it to the libraries listed below.
- RLHF implementation details of OAI's 2019 codebase ☆178 · Updated last year
- ☆96 · Updated last year
- A (somewhat) minimal library for finetuning language models with PPO on human feedback ☆86 · Updated 2 years ago
- Multipack distributed sampler for fast padding-free training of LLMs ☆184 · Updated 6 months ago
- Simple next-token-prediction for RLHF ☆222 · Updated last year
- Implementation of the Recurrent Memory Transformer (NeurIPS 2022 paper) in PyTorch ☆405 · Updated last month
- Implementation of Reinforcement Learning from Human Feedback (RLHF) ☆171 · Updated last year
- ☆456 · Updated last year
- Official repository of NEFTune: Noisy Embeddings Improve Instruction Finetuning (see the noise sketch after this list) ☆389 · Updated 9 months ago
- DSIR large-scale data selection framework for language model training ☆241 · Updated 10 months ago
- Batched LoRAs ☆338 · Updated last year
- Due to LLaMA's license restrictions, we try to reimplement BLOOM-LoRA (the BLOOM license is much less restrictive: https://huggingface.co/spaces/bigs… ☆185 · Updated last year
- Rectified Rotary Position Embeddings ☆351 · Updated 9 months ago
- A crude RLHF layer on top of nanoGPT with the Gumbel-Softmax trick (see the sketch after this list) ☆289 · Updated last year
- All available datasets for Instruction Tuning of Large Language Models ☆242 · Updated last year
- Exploring finetuning public checkpoints on filtered 8K-token sequences from the Pile ☆115 · Updated last year
- Code for the ALiBi method for transformer language models (ICLR 2022; see the bias sketch after this list) ☆515 · Updated last year
- Implementation of CALM from the paper "LLM Augmented LLMs: Expanding Capabilities through Composition", out of Google DeepMind ☆173 · Updated 5 months ago
- Official code for ReLoRA from the paper "Stack More Layers Differently: High-Rank Training Through Low-Rank Updates" ☆443 · Updated 10 months ago
- Implementation of ChatGPT-style RLHF (Reinforcement Learning from Human Feedback) on any generation model in Hugging Face's transformers (blommz-… ☆551 · Updated 9 months ago
- Pre-training code for Amber 7B LLM ☆162 · Updated 9 months ago
- A simulation framework for RLHF and alternatives. Develop your RLHF method without collecting human data ☆794 · Updated 7 months ago
- PyTorch implementation of DoReMi, a method for optimizing the data mixture weights in language modeling datasets ☆312 · Updated last year
- Recurrent Memory Transformer ☆149 · Updated last year
- Experiments on speculative sampling with Llama models ☆124 · Updated last year
- Implementation of the paper "Data Engineering for Scaling Language Models to 128K Context" ☆451 · Updated 11 months ago
- Scaling Data-Constrained Language Models ☆333 · Updated 5 months ago
- Open Instruction Generalist is an assistant trained on massive synthetic instructions to perform many millions of tasks ☆208 · Updated last year
- LLaMA-TRL: Fine-tuning LLaMA with PPO and LoRA ☆200 · Updated last year
- ☆105 · Updated last year
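
For readers comparing these repos, here is a minimal sketch of the NEFTune idea referenced above: during instruction finetuning, uniform noise scaled by alpha / sqrt(L * d) is added to the token embeddings. The function name and interface are illustrative assumptions, not the official repository's API.

```python
import torch

def neftune_embed(embed: torch.nn.Embedding, input_ids: torch.Tensor,
                  alpha: float = 5.0) -> torch.Tensor:
    """Embed tokens, then add NEFTune-style uniform noise (training only)."""
    x = embed(input_ids)                        # (batch, seq_len, dim)
    if embed.training:
        seq_len, dim = x.shape[-2], x.shape[-1]
        scale = alpha / (seq_len * dim) ** 0.5  # noise magnitude from the paper
        x = x + torch.empty_like(x).uniform_(-1.0, 1.0) * scale
    return x
```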
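Likewise, a minimal sketch of the Gumbel-Softmax trick named by the nanoGPT RLHF repo: sampling from categorical logits is made differentiable by adding Gumbel noise and applying a temperature-scaled softmax. This is a generic illustration, not that repo's code; PyTorch also ships a built-in `torch.nn.functional.gumbel_softmax`.

```python
import torch
import torch.nn.functional as F

def gumbel_softmax_sample(logits: torch.Tensor, tau: float = 1.0) -> torch.Tensor:
    """Differentiable 'soft' one-hot sample from categorical logits."""
    u = torch.rand_like(logits)
    gumbel = -torch.log(-torch.log(u + 1e-10) + 1e-10)  # Gumbel(0, 1) noise
    return F.softmax((logits + gumbel) / tau, dim=-1)   # -> one-hot as tau -> 0
```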
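Finally, a sketch of the ALiBi bias: instead of positional embeddings, each attention head subtracts a head-specific slope times the query-key distance from its pre-softmax attention scores. The slopes below follow the paper's geometric sequence for power-of-two head counts; the helper name is an assumption, not the official code.

```python
import torch

def alibi_bias(num_heads: int, seq_len: int) -> torch.Tensor:
    """(heads, seq, seq) additive bias for causal attention scores."""
    # Geometric slopes 2^(-8/n), 2^(-16/n), ... (paper's recipe for 2^k heads)
    slopes = 2.0 ** (-8.0 * torch.arange(1, num_heads + 1) / num_heads)
    pos = torch.arange(seq_len)
    dist = (pos[:, None] - pos[None, :]).clamp(min=0)  # query-to-key distance
    return -slopes[:, None, None] * dist               # add to attention scores
```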