vicgalle / awesome-rlaif
A curated and updated list of relevant articles and repositories on Reinforcement Learning from AI Feedback (RLAIF)
☆12 Updated last year
Alternatives and similar repositories for awesome-rlaif
Users who are interested in awesome-rlaif are comparing it to the repositories listed below
- The code implementation of MAGDi: Structured Distillation of Multi-Agent Interaction Graphs Improves Reasoning in Smaller Language Models… ☆34 Updated last year
- Code for experiments on self-prediction as a way to measure introspection in LLMs ☆13 Updated 5 months ago
- Refined Direct Preference Optimization with Synthetic Data for Behavioral Alignment of LLMs ☆13 Updated last year
- Scalable Meta-Evaluation of LLMs as Evaluators ☆42 Updated last year
- Explore visualization tools for understanding Transformer-based large language models (LLMs) ☆12 Updated 5 months ago
- ☆97 Updated 10 months ago
- Code and Dataset for Learning to Solve Complex Tasks by Talking to Agents ☆24 Updated 2 years ago
- ☆48 Updated 6 months ago
- Minimum Description Length probing for neural network representations ☆19 Updated 3 months ago
- Generating and validating natural-language explanations for the brain. ☆52 Updated last month
- Reasoning by Communicating with Agents ☆28 Updated 2 weeks ago
- ☆27 Updated last year
- 🤓 A collection of AWESOME structured summaries of Large Language Models (LLMs) ☆27 Updated last year
- The open source implementation of "Connecting Large Language Models with Evolutionary Algorithms Yields Powerful Prompt Optimizers" ☆20 Updated last year
- ☆15 Updated last month
- Open Implementations of LLM Analyses ☆102 Updated 7 months ago
- ☆42 Updated last year
- ☆69 Updated 3 months ago
- ☆20 Updated 5 months ago
- datasets from the paper "Towards Understanding Sycophancy in Language Models" ☆75 Updated last year
- Measuring the situational awareness of language models ☆34 Updated last year
- Lottery Ticket Adaptation ☆39 Updated 5 months ago
- Codebase for Inference-Time Policy Adapters ☆23 Updated last year
- ☆25 Updated 4 months ago
- This repository includes a benchmark and code for the paper "Evaluating LLMs at Detecting Errors in LLM Responses". ☆29 Updated 8 months ago
- ZYN: Zero-Shot Reward Models with Yes-No Questions ☆33 Updated last year
- [ICML 2025] Flow of Reasoning: Training LLMs for Divergent Problem Solving with Minimal Examples ☆85 Updated last month
- Evaluating the Moral Beliefs Encoded in LLMs ☆26 Updated 4 months ago
- ☆46 Updated this week
- Implementation of SelfExtend from the paper "LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning" from Pytorch and Zeta ☆13 Updated 6 months ago