xrsrke / instructGOOSELinks

Implementation of Reinforcement Learning from Human Feedback (RLHF)

☆172

Alternatives and similar repositories for instructGOOSE

Users that are interested in instructGOOSE are comparing it to the libraries listed below

Sorting:

Dahoas / reward-modeling
☆98Updated 2 years ago
thomfoster / minRLHF
A (somewhat) minimal library for finetuning language models with PPO on human feedback.
☆87Updated 2 years ago
tomekkorbak / pretraining-with-human-feedback
Code accompanying the paper Pretraining Language Models with Human Preferences
☆180Updated last year
haoliuhl / chain-of-hindsight
Simple next-token-prediction for RLHF
☆226Updated 2 years ago
tianjunz / HIR
☆159Updated 2 years ago
orhonovich / unnatural-instructions
☆179Updated 2 years ago
Langboat / mengzi-retrieval-lm
An experimental implementation of the retrieval-enhanced language model
☆75Updated 2 years ago
akoksal / LongForm
Reverse Instructions to generate instruction tuning data with corpus examples
☆214Updated last year
bigscience-workshop / t-zero
Reproduce results and replicate training fo T0 (Multitask Prompted Training Enables Zero-Shot Task Generalization)
☆462Updated 2 years ago
LAION-AI / Open-Instruction-Generalist
Open Instruction Generalist is an assistant trained on massive synthetic instructions to perform many millions of tasks
☆209Updated last year
declare-lab / flan-alpaca
This repository contains code for extending the Stanford Alpaca synthetic instruction tuning to existing instruction-tuned models such as…
☆356Updated 2 years ago
ethanyanjiali / minChatGPT
A minimum example of aligning language models with RLHF similar to ChatGPT
☆223Updated 2 years ago
yizhongw / Tk-Instruct
Tk-Instruct is a Transformer model that is tuned to solve many NLP tasks by following instructions.
☆181Updated 2 years ago
voidful / TextRL
Implementation of ChatGPT RLHF (Reinforcement Learning with Human Feedback) on any generation model in huggingface's transformer (blommz-…
☆565Updated last year
facebookresearch / Shepherd
This is the repo for the paper Shepherd -- A Critic for Language Model Generation
☆217Updated 2 years ago
lxe / llama-tune
LLaMa Tuning with Stanford Alpaca Dataset using Deepspeed and Transformers
☆51Updated 2 years ago
seonghyeonye / TAPP
[AAAI 2024] Investigating the Effectiveness of Task-Agnostic Prefix Prompt for Instruction Following
☆78Updated last year
raunak-agarwal / instruction-datasets
Datasets for Instruction Tuning of Large Language Models
☆257Updated last year
bhargaviparanjape / language-programmes
☆173Updated 2 years ago
bigscience-workshop / lm-evaluation-harness
A framework for few-shot evaluation of autoregressive language models.
☆104Updated 2 years ago
anthropics / ConstitutionalHarmlessnessPaper
☆242Updated 2 years ago
corolla-johnson / mkultra
Prompt tuning toolkit for GPT-2 and GPT-Neo
☆88Updated 4 years ago
sanjeevanahilan / nanoChatGPT
A crude RLHF layer on top of nanoGPT with Gumbel-Softmax trick
☆293Updated last year
FranxYao / GPT-Bargaining
Code for Arxiv 2023: Improving Language Model Negociation with Self-Play and In-Context Learning from AI Feedback
☆207Updated 2 years ago
linhduongtuan / BLOOM-LORA
Due to restriction of LLaMA, we try to reimplement BLOOM-LoRA (much less restricted BLOOM license here https://huggingface.co/spaces/bigs…
☆184Updated 2 years ago
ypeleg / llama
User-friendly LLaMA: Train or Run the model using PyTorch. Nothing else.
☆338Updated 2 years ago
zphang / minimal-llama
☆457Updated 2 years ago
yangkevin2 / emnlp22-re3-story-generation
☆253Updated 2 years ago
xrsrke / toolformer
Implementation of Toolformer: Language Models Can Teach Themselves to Use Tools
☆142Updated 2 years ago
vihangd / alpaca-qlora
Instruct-tune Open LLaMA / RedPajama / StableLM models on consumer hardware using QLoRA
☆81Updated last year