S-Abdelnabi / LLM-Deliberation
Code for our NeurIPS'24 Dataset and Benchmark paper: Cooperation, Competition, and Maliciousness: LLM-Stakeholders Interactive Negotiation
☆25 · Updated 3 months ago
Alternatives and similar repositories for LLM-Deliberation:
Users who are interested in LLM-Deliberation are comparing it to the repositories listed below
- awesome-LLM-controlled-constrained-generation ☆39 · Updated 6 months ago
- [ICLR'24 Spotlight] A language model (LM)-based emulation framework for identifying the risks of LM agents with tool use ☆127 · Updated 11 months ago
- [ACL'24] Code and data of paper "When is Tree Search Useful for LLM Planning? It Depends on the Discriminator" ☆54 · Updated 11 months ago
- Code release for "Debating with More Persuasive LLMs Leads to More Truthful Answers" ☆100 · Updated 11 months ago
- A simple GPT-based evaluation tool for multi-aspect, interpretable assessment of LLMs. ☆83 · Updated last year
- datasets from the paper "Towards Understanding Sycophancy in Language Models" ☆71 · Updated last year
- Critique-out-Loud Reward Models ☆52 · Updated 4 months ago
- [EMNLP 2024] The official GitHub repo for the paper "Course-Correction: Safety Alignment Using Synthetic Preferences" ☆19 · Updated 4 months ago
- [ICLR'24] RAIN: Your Language Models Can Align Themselves without Finetuning ☆89 · Updated 8 months ago
- ☆40 · Updated last week
- Code for the ICLR 2024 paper "How to catch an AI liar: Lie detection in black-box LLMs by asking unrelated questions" ☆64 · Updated 8 months ago
- 【ACL 2024】 SALAD benchmark & MD-Judge ☆125 · Updated 2 months ago
- Governance of the Commons Simulation (GovSim) ☆38 · Updated last month
- ☆46 · Updated last month
- Röttger et al. (NAACL 2024): "XSTest: A Test Suite for Identifying Exaggerated Safety Behaviours in Large Language Models" ☆84 · Updated last week
- Reproduction of "RLCD: Reinforcement Learning from Contrast Distillation for Language Model Alignment" ☆66 · Updated last year
- Flow of Reasoning: Training LLMs for Divergent Problem Solving with Minimal Examples ☆73 · Updated last month
- Official Implementation of Dynamic LLM-Agent Network: An LLM-agent Collaboration Framework with Agent Team Optimization ☆128 · Updated 9 months ago
- AbstainQA, ACL 2024 ☆25 · Updated 4 months ago
- Ferret: Faster and Effective Automated Red Teaming with Reward-Based Scoring Technique ☆13 · Updated 6 months ago
- ☆59 · Updated 3 weeks ago
- Dialogue Action Tokens: Steering Language Models in Goal-Directed Dialogue with a Multi-Turn Planner ☆21 · Updated 7 months ago
- [ACL'24] Can LLMs Speak For Diverse People? Tuning LLMs via Debate to Generate Controllable Controversial Statements ☆21 · Updated 5 months ago
- Code for ACL2024 paper - Adversarial Preference Optimization (APO). ☆51 · Updated 8 months ago
- [ACL 2024] Exploring Collaboration Mechanisms for LLM Agents: A Social Psychology View ☆110 · Updated 9 months ago
- Personalized Soups: Personalized Large Language Model Alignment via Post-hoc Parameter Merging ☆99 · Updated last year
- Semi-Parametric Editing with a Retrieval-Augmented Counterfactual Model ☆66 · Updated 2 years ago
- Evaluate the Quality of Critique ☆35 · Updated 8 months ago
- Code for the paper <SelfCheck: Using LLMs to Zero-Shot Check Their Own Step-by-Step Reasoning> ☆48 · Updated last year
- R-Judge: Benchmarking Safety Risk Awareness for LLM Agents (EMNLP Findings 2024) ☆65 · Updated 2 weeks ago