vicgalle / awesome-rlaif
A curated and updated list of relevant articles and repositories on Reinforcement Learning from AI Feedback (RLAIF)
★12 · Updated last year
Alternatives and similar repositories for awesome-rlaif
Users who are interested in awesome-rlaif are comparing it to the repositories listed below
- Code for experiments on self-prediction as a way to measure introspection in LLMs ★14 · Updated 6 months ago
- A collection of AWESOME structured summaries of Large Language Models (LLMs) ★27 · Updated last year
- The code implementation of MAGDi: Structured Distillation of Multi-Agent Interaction Graphs Improves Reasoning in Smaller Language Models… ★34 · Updated last year
- Reward Model framework for LLM RLHF ★61 · Updated 2 years ago
- Script for processing OpenAI's PRM800K process supervision dataset into an Alpaca-style instruction-response format ★27 · Updated last year
- datasets from the paper "Towards Understanding Sycophancy in Language Models" ★81 · Updated last year
- Code for the ICLR 2024 paper "How to catch an AI liar: Lie detection in black-box LLMs by asking unrelated questions" ★70 · Updated last year
- A forest of autonomous agents. ★19 · Updated 5 months ago
- Exploration using DSPy to optimize modules to maximize performance on the OpenToM dataset ★16 · Updated last year
- Lottery Ticket Adaptation ★39 · Updated 7 months ago
- A framework for pitting LLMs against each other in an evolving library of games ★32 · Updated 2 months ago
- Accompanying code and SEP dataset for the "Can LLMs Separate Instructions From Data? And What Do We Even Mean By That?" paper. ★53 · Updated 3 months ago
- OpenPipe Reinforcement Learning Experiments ★25 · Updated 3 months ago
- Refined Direct Preference Optimization with Synthetic Data for Behavioral Alignment of LLMs ★13 · Updated last year
- Implementation of SelfExtend from the paper "LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning" from Pytorch and Zeta ★13 · Updated 7 months ago
- ★18 · Updated 2 months ago
- OLMost every training recipe you need to perform data interventions with the OLMo family of models. ★34 · Updated this week
- Learning to route instances for Human vs AI Feedback (ACL 2025 Main) ★23 · Updated last month
- ★51 · Updated 7 months ago
- NeurIPS 2024 tutorial on LLM Inference ★45 · Updated 6 months ago
- Minimal implementation of the "Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models" paper (arXiv 2401.01335) ★28 · Updated last year
- Official implementation of LoT paper: "Enhancing Zero-Shot Chain-of-Thought Reasoning in Large Language Models through Logic" ★25 · Updated last year
- ★21 · Updated last month
- Intelligent Go-Explore: Standing on the Shoulders of Giant Foundation Models ★58 · Updated 4 months ago
- AgentRewardBench: Evaluating Automatic Evaluations of Web Agent Trajectories ★18 · Updated last month
- This repo contains code for the paper "Psychologically-informed chain-of-thought prompts for metaphor understanding in large language mod… ★14 · Updated 2 years ago
- ★96 · Updated last year
- Thorn in a HaizeStack test for evaluating long-context adversarial robustness. ★26 · Updated 10 months ago
- Survival of the Most Influential Prompts: Efficient Black-Box Prompt Search via Clustering and Pruning (Zhou et al.; EMNLP 2023 Findings) ★17 · Updated last year
- Explore visualization tools for understanding Transformer-based large language models (LLMs) ★13 · Updated 6 months ago