Human preference data for "Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback"
☆1,818 · Jun 17, 2025 · Updated 8 months ago
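The hh-rlhf data consists of human preference pairs: each record contains a "chosen" and a "rejected" Human/Assistant transcript sharing the same prompt. Below is a minimal sketch of reading one split, assuming a local checkout with the repo's gzipped JSONL layout (the `helpful-base/train.jsonl.gz` path is an assumption and may differ from the actual file layout):

```python
import gzip
import json

# Hypothetical local path; the actual file layout in a checkout of hh-rlhf may differ.
PATH = "helpful-base/train.jsonl.gz"

def iter_pairs(path):
    """Yield (chosen, rejected) transcript pairs from a gzipped JSONL split."""
    with gzip.open(path, "rt", encoding="utf-8") as f:
        for line in f:
            record = json.loads(line)
            # "chosen" was preferred by the human annotator over "rejected";
            # both are full "\n\nHuman: ... \n\nAssistant: ..." transcripts.
            yield record["chosen"], record["rejected"]

if __name__ == "__main__":
    chosen, rejected = next(iter_pairs(PATH))
    print(chosen[:200])
    print(rejected[:200])
```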
Alternatives and similar repositories for hh-rlhf
Users interested in hh-rlhf are comparing it to the libraries listed below.
- A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF) · ☆4,738 · Jan 8, 2024 · Updated 2 years ago
- A modular RL library to fine-tune language models to human preferences · ☆2,380 · Mar 1, 2024 · Updated 2 years ago
- ☆251 · Dec 21, 2022 · Updated 3 years ago
- Safe RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback · ☆1,589 · Nov 24, 2025 · Updated 3 months ago
- Code for the paper Fine-Tuning Language Models from Human Preferences · ☆1,378 · Jul 25, 2023 · Updated 2 years ago
- Train transformer language models with reinforcement learning · ☆17,523 · Updated this week
- 800,000 step-level correctness labels on LLM solutions to MATH problems · ☆2,094 · Jun 1, 2023 · Updated 2 years ago
- [NIPS2023] RRHF & Wombat