ruizheng20 / gpo
Code for the paper "Toward Optimal LLM Alignments Using Two-Player Games".
☆17 Updated last year
Alternatives and similar repositories for gpo
Users who are interested in gpo are comparing it to the repositories listed below.
- Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision ☆123 Updated last year
- A Mechanistic Understanding of Alignment Algorithms: A Case Study on DPO and Toxicity. ☆78 Updated 6 months ago
- ☆44 Updated last year
- ☆36 Updated last year
- Teaching Models to Express Their Uncertainty in Words ☆39 Updated 3 years ago
- Skill-It! A Data-Driven Skills Framework for Understanding and Training Language Models ☆47 Updated last year
- Align your LM to express calibrated verbal statements of confidence in its long-form generations. ☆27 Updated last year
- ☆101 Updated last year
- This is the official repo for Towards Uncertainty-Aware Language Agent. ☆28 Updated last year
- Domain-specific preference (DSP) data and customized RM fine-tuning. ☆25 Updated last year
- [NeurIPS'23] Aging with GRACE: Lifelong Model Editing with Discrete Key-Value Adaptors ☆81 Updated 9 months ago
- ☆46 Updated 2 years ago
- Self-Supervised Alignment with Mutual Information ☆21 Updated last year
- Official repository for ALT (ALignment with Textual feedback). ☆10 Updated last year
- The code of “Improving Weak-to-Strong Generalization with Scalable Oversight and Ensemble Learning” ☆17 Updated last year
- ☆52 Updated 4 months ago
- Code for Paper (Preserving Diversity in Supervised Fine-tuning of Large Language Models) ☆41 Updated 4 months ago
- Lightweight Adapting for Black-Box Large Language Models ☆23 Updated last year
- Dialogue Action Tokens: Steering Language Models in Goal-Directed Dialogue with a Multi-Turn Planner ☆28 Updated last year
- [ACL 2024 main] Aligning Large Language Models with Human Preferences through Representation Engineering (https://aclanthology.org/2024.… ☆27 Updated 11 months ago
- Source codes for "Preference-grounded Token-level Guidance for Language Model Fine-tuning" (NeurIPS 2023). ☆16 Updated 8 months ago
- ☆38 Updated last year
- Grade-School Math with Irrelevant Context (GSM-IC) benchmark is an arithmetic reasoning dataset built upon GSM8K, by adding irrelevant se… ☆62 Updated 2 years ago
- ☆75 Updated last year
- Code for the paper "A Mechanistic Interpretation of Arithmetic Reasoning in Language Models using Causal Mediation Analysis" ☆19 Updated 3 months ago
- Fairer Preferences Elicit Improved Human-Aligned Large Language Model Judgments (Zhou et al., EMNLP 2024) ☆13 Updated 11 months ago
- [NeurIPS 2023 D&B Track] Code and data for paper "Revisiting Out-of-distribution Robustness in NLP: Benchmarks, Analysis, and LLMs Evalua… ☆35 Updated 2 years ago
- The accompanying code for "Transformer Feed-Forward Layers Are Key-Value Memories". Mor Geva, Roei Schuster, Jonathan Berant, and Omer Le… ☆96 Updated 4 years ago
- Code for the paper <SelfCheck: Using LLMs to Zero-Shot Check Their Own Step-by-Step Reasoning> ☆49 Updated 2 years ago
- ☆100 Updated last year