anthropics / DecompositionFaithfulnessPaper
☆23Updated last year
Alternatives and similar repositories for DecompositionFaithfulnessPaper:
Users that are interested in DecompositionFaithfulnessPaper are comparing it to the libraries listed below
- Script for processing OpenAI's PRM800K process supervision dataset into an Alpaca-style instruction-response format☆27Updated last year
- ☆27Updated 2 weeks ago
- NeurIPS 2024 tutorial on LLM Inference☆37Updated last month
- Scalable Meta-Evaluation of LLMs as Evaluators☆42Updated 11 months ago
- Flow of Reasoning: Training LLMs for Divergent Problem Solving with Minimal Examples☆57Updated this week
- Reference implementation for Reward-Augmented Decoding: Efficient Controlled Text Generation With a Unidirectional Reward Model☆42Updated last year
- Repository for NPHardEval, a quantified-dynamic benchmark of LLMs☆51Updated 9 months ago
- Q-Probe: A Lightweight Approach to Reward Maximization for Language Models☆40Updated 7 months ago
- Minimal implementation of the Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models paper (ArXiv 20232401.01335)☆29Updated 10 months ago
- Code for the arXiv preprint "The Unreasonable Effectiveness of Easy Training Data"☆46Updated last year
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment☆53Updated 4 months ago
- ☆48Updated 11 months ago
- [ACL'24] Code and data of paper "When is Tree Search Useful for LLM Planning? It Depends on the Discriminator"☆53Updated 10 months ago
- datasets from the paper "Towards Understanding Sycophancy in Language Models"☆66Updated last year
- This repository includes code for the paper "Does Localization Inform Editing? Surprising Differences in Where Knowledge Is Stored vs. Ca…☆58Updated last year
- Aligning with Human Judgement: The Role of Pairwise Preference in Large Language Model Evaluators (Liu et al.; COLM 2024)☆40Updated 3 weeks ago
- Repo for: When to Make Exceptions: Exploring Language Models as Accounts of Human Moral Judgment☆38Updated last year
- Understanding the correlation between different LLM benchmarks☆29Updated last year
- The code implementation of MAGDi: Structured Distillation of Multi-Agent Interaction Graphs Improves Reasoning in Smaller Language Models…☆31Updated 11 months ago
- Code and Configs for Asynchronous RLHF: Faster and More Efficient RL for Language Models☆23Updated last month
- On The Planning Abilities of OpenAI's o1 Models: Feasibility, Optimality, and Generalizability☆36Updated 2 months ago
- About The corresponding code from our paper " REFINER: Reasoning Feedback on Intermediate Representations" (EACL 2024). Do not hesitate t…☆69Updated 10 months ago
- Tree prompting: easy-to-use scikit-learn interface for improved prompting.☆35Updated last year
- Evaluate the Quality of Critique☆35Updated 7 months ago
- A dataset of LLM-generated chain-of-thought steps annotated with mistake location.☆77Updated 5 months ago
- Advantage Leftover Lunch Reinforcement Learning (A-LoL RL): Improving Language Models with Advantage-based Offline Policy Gradients☆26Updated 4 months ago
- ☆20Updated 7 months ago
- Reasoning by Communicating with Agents☆23Updated 3 months ago
- Code for the ICLR 2024 paper "How to catch an AI liar: Lie detection in black-box LLMs by asking unrelated questions"☆64Updated 6 months ago
- 👻 Code and benchmark for our EMNLP 2023 paper - "FANToM: A Benchmark for Stress-testing Machine Theory of Mind in Interactions"☆52Updated 7 months ago