HumanSignal / RLHF
A collection of links, tutorials, and best practices on how to collect data and build an end-to-end RLHF system for fine-tuning generative AI models
☆189 Updated last year
Related projects
Alternatives and complementary repositories for RLHF
- A set of scripts and notebooks on LLM fine-tuning and dataset creation ☆92 Updated last month
- A curated list of Human Preference Datasets for LLM fine-tuning, RLHF, and eval. ☆314 Updated last year
- Let's build better datasets, together! ☆202 Updated 3 months ago
- Scripts for fine-tuning Llama2 via SFT and DPO. ☆179 Updated last year
- awesome synthetic (text) datasets ☆239 Updated last week
- ToolQA, a new dataset to evaluate the capabilities of LLMs in answering challenging questions with external tools. It offers two levels … ☆239 Updated last year
- Code accompanying the paper Pretraining Language Models with Human Preferences ☆176 Updated 8 months ago
- Manage scalable open LLM inference endpoints in Slurm clusters ☆237 Updated 3 months ago
- Implementation of Reinforcement Learning from Human Feedback (RLHF) ☆169 Updated last year
- Chain-of-Hindsight, A Scalable RLHF Method ☆218 Updated last year
- A library with extensible implementations of DPO, KTO, PPO, ORPO, and other human-aware loss functions (HALOs). ☆738 Updated last week
- [EMNLP 2023] Enabling Large Language Models to Generate Text with Citations. Paper: https://arxiv.org/abs/2305.14627 ☆457 Updated last month
- This is the repo for the paper Shepherd -- A Critic for Language Model Generation ☆211 Updated last year
- Lightweight demos for finetuning LLMs. Powered by 🤗 transformers and open-source datasets. ☆64 Updated 3 weeks ago
- RewardBench: the first evaluation tool for reward models. ☆424 Updated 2 weeks ago
- Official repository for ORPO ☆420 Updated 5 months ago
- An extensible benchmark for evaluating large language models on planning ☆288 Updated 5 months ago
- Code repo for "Agent Instructs Large Language Models to be General Zero-Shot Reasoners" ☆73 Updated 2 months ago
- Starter pack for NeurIPS LLM Efficiency Challenge 2023. ☆115 Updated last year
- What's In My Big Data (WIMBD) - a toolkit for analyzing large text datasets ☆188 Updated 2 months ago
- The official evaluation suite and dynamic data release for MixEval. ☆222 Updated last week
- ☆445 Updated last week
- This project studies the performance and robustness of language models and task-adaptation methods. ☆141 Updated 5 months ago
- Generative Representational Instruction Tuning ☆562 Updated this week
- A (somewhat) minimal library for finetuning language models with PPO on human feedback. ☆86 Updated last year
- [ICLR 2024 Spotlight] FLASK: Fine-grained Language Model Evaluation based on Alignment Skill Sets ☆210 Updated 10 months ago
- GitHub repository for "RAGTruth: A Hallucination Corpus for Developing Trustworthy Retrieval-Augmented Language Models" ☆114 Updated last month
- Finetune mistral-7b-instruct for sentence embeddings ☆70 Updated 6 months ago
- A bagel, with everything. ☆312 Updated 6 months ago
- A framework for few-shot evaluation of autoregressive language models. ☆101 Updated last year