vicgalle / awesome-rlaif
A curated and updated list of relevant articles and repositories on Reinforcement Learning from AI Feedback (RLAIF)
☆12 · Updated last year
Alternatives and similar repositories for awesome-rlaif
Users who are interested in awesome-rlaif are comparing it to the repositories listed below.
- Code for experiments on self-prediction as a way to measure introspection in LLMs ☆13 · Updated 5 months ago
- ☆25 · Updated 4 months ago
- Script for processing OpenAI's PRM800K process supervision dataset into an Alpaca-style instruction-response format ☆27 · Updated last year
- 🤓 A collection of AWESOME structured summaries of Large Language Models (LLMs) ☆27 · Updated last year
- Implementation of SelfExtend from the paper "LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning" using PyTorch and Zeta ☆13 · Updated 6 months ago
- This repo contains code for the paper "Psychologically-informed chain-of-thought prompts for metaphor understanding in large language mod… ☆14 · Updated 2 years ago
- This repository includes a benchmark and code for the paper "Evaluating LLMs at Detecting Errors in LLM Responses". ☆29 · Updated 9 months ago
- The code implementation of MAGDi: Structured Distillation of Multi-Agent Interaction Graphs Improves Reasoning in Smaller Language Models… ☆34 · Updated last year
- Reward Model framework for LLM RLHF ☆61 · Updated 2 years ago
- ☆40 · Updated last week
- Code Implementation, Evaluations, Documentation, Links and Resources for Min P paper ☆35 · Updated 2 months ago
- Code for the ICLR 2024 paper "How to catch an AI liar: Lie detection in black-box LLMs by asking unrelated questions" ☆70 · Updated 11 months ago
- ☆43 · Updated 2 years ago
- ☆15 · Updated 2 months ago
- ☆49 · Updated 7 months ago
- Code for Preventing Language Models From Hiding Their Reasoning, which evaluates defenses against LLM steganography. ☆21 · Updated last year
- Measuring the situational awareness of language models ☆35 · Updated last year
- Thorn in a HaizeStack test for evaluating long-context adversarial robustness. ☆26 · Updated 10 months ago
- Official implementation of LoT paper: "Enhancing Zero-Shot Chain-of-Thought Reasoning in Large Language Models through Logic" ☆25 · Updated last year
- Refined Direct Preference Optimization with Synthetic Data for Behavioral Alignment of LLMs ☆13 · Updated last year
- Explore visualization tools for understanding Transformer-based large language models (LLMs) ☆12 · Updated 6 months ago
- FBI: Finding Blindspots in LLM Evaluations with Interpretable Checklists ☆29 · Updated 2 months ago
- Official repo for NAACL 2024 Findings paper "LeTI: Learning to Generate from Textual Interactions." ☆65 · Updated last year
- ZYN: Zero-Shot Reward Models with Yes-No Questions ☆34 · Updated last year
- ☆97 · Updated 11 months ago
- ☆12 · Updated last year
- Scalable Meta-Evaluation of LLMs as Evaluators ☆42 · Updated last year
- ☆24 · Updated 8 months ago
- Intelligent Go-Explore: Standing on the Shoulders of Giant Foundation Models ☆58 · Updated 3 months ago
- Lottery Ticket Adaptation ☆39 · Updated 6 months ago