ethanyanjiali / minChatGPT
A minimal example of aligning language models with RLHF, similar to ChatGPT
☆213 · Updated last year
Related projects
Alternatives and complementary repositories for minChatGPT
- ☆94 · Updated last year
- A (somewhat) minimal library for finetuning language models with PPO on human feedback. ☆86 · Updated last year
- RLHF implementation details of OAI's 2019 codebase ☆152 · Updated 9 months ago
- Open Instruction Generalist is an assistant trained on massive synthetic instructions to perform many millions of tasks ☆206 · Updated 10 months ago
- Recurrent Memory Transformer ☆150 · Updated last year
- Implementation of Recurrent Memory Transformer, NeurIPS 2022 paper, in PyTorch ☆393 · Updated 9 months ago
- Due to the restrictions on LLaMA, we try to reimplement BLOOM-LoRA (the much less restrictive BLOOM license is here: https://huggingface.co/spaces/bigs…) ☆183 · Updated last year
- Multipack distributed sampler for fast padding-free training of LLMs ☆176 · Updated 3 months ago
- Rectified Rotary Position Embeddings ☆339 · Updated 5 months ago
- Chain-of-Hindsight, a scalable RLHF method ☆218 · Updated last year
- A large-scale, fine-grained, diverse preference dataset (and models). ☆309 · Updated 10 months ago
- Official repository of "NEFTune: Noisy Embeddings Improve Instruction Finetuning" ☆381 · Updated 5 months ago
- This is the repo for the paper "Shepherd: A Critic for Language Model Generation" ☆211 · Updated last year
- A fine-tuned LLaMA that is good at arithmetic tasks ☆174 · Updated last year
- All available datasets for instruction tuning of large language models ☆236 · Updated 11 months ago
- ☆158 · Updated last year
- DSIR, a large-scale data selection framework for language model training ☆227 · Updated 7 months ago
- Implementation of the conditionally routed attention in the CoLT5 architecture, in PyTorch ☆224 · Updated 2 months ago
- Code for fine-tuning Platypus family LLMs using LoRA ☆623 · Updated 9 months ago
- Implementation of the paper "Data Engineering for Scaling Language Models to 128K Context" ☆435 · Updated 7 months ago
- Code used for sourcing and cleaning the BigScience ROOTS corpus ☆305 · Updated last year
- Official code for ReLoRA from the paper "Stack More Layers Differently: High-Rank Training Through Low-Rank Updates" ☆433 · Updated 6 months ago
- Implementation of Reinforcement Learning from Human Feedback (RLHF) ☆169 · Updated last year
- A crude RLHF layer on top of nanoGPT with the Gumbel-Softmax trick ☆287 · Updated 11 months ago
- [NeurIPS 2023] RRHF & Wombat ☆798 · Updated last year
- Code and data for "Scaling Relationship on Learning Mathematical Reasoning with Large Language Models" ☆216 · Updated 2 months ago
- BABILong, a benchmark for LLM evaluation using the needle-in-a-haystack approach ☆150 · Updated 2 months ago
- This repository contains code to quantitatively evaluate instruction-tuned models such as Alpaca and Flan-T5 on held-out tasks ☆528 · Updated 8 months ago
- Code accompanying the paper "Pretraining Language Models with Human Preferences" ☆176 · Updated 8 months ago
- PyTorch implementation of DoReMi, a method for optimizing the data mixture weights in language modeling datasets ☆304 · Updated 10 months ago