deeplearning-wisc / args
☆ 33 · Updated 9 months ago
Related projects
Alternatives and complementary repositories for args
- Rewarded soups official implementation · ☆ 51 · Updated last year
- Source code for "Preference-grounded Token-level Guidance for Language Model Fine-tuning" (NeurIPS 2023) · ☆ 14 · Updated last year
- A Mechanistic Understanding of Alignment Algorithms: A Case Study on DPO and Toxicity · ☆ 54 · Updated 2 weeks ago
- Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision · ☆ 97 · Updated 2 months ago
- [ACL'24] Beyond One-Preference-Fits-All Alignment: Multi-Objective Direct Preference Optimization · ☆ 54 · Updated 3 months ago
- ☆ 81 · Updated last year
- Official code for "Decoding-Time Language Model Alignment with Multiple Objectives" · ☆ 14 · Updated 3 weeks ago
- ☆ 20 · Updated 4 months ago
- Code for most of the experiments in the paper "Understanding the Effects of RLHF on LLM Generalisation and Diversity" · ☆ 38 · Updated 10 months ago
- Code for the ICML 2024 paper "Rewards-in-Context: Multi-objective Alignment of Foundation Models with Dynamic Preference Adjustment" · ☆ 48 · Updated 2 weeks ago
- ☆ 14 · Updated 8 months ago
- Align your LM to express calibrated verbal statements of confidence in its long-form generations · ☆ 19 · Updated 5 months ago
- ☆ 26 · Updated 6 months ago
- Directional Preference Alignment · ☆ 50 · Updated last month
- ☆ 24 · Updated 6 months ago
- Lightweight Adapting for Black-Box Large Language Models · ☆ 18 · Updated 9 months ago
- Code for the ACL 2023 paper "BOLT: Fast Energy-based Controlled Text Generation with Tunable Biases" · ☆ 19 · Updated last year
- Long Is More for Alignment: A Simple but Tough-to-Beat Baseline for Instruction Fine-Tuning [ICML 2024] · ☆ 15 · Updated 6 months ago
- ☆ 24 · Updated last year
- Personalized Soups: Personalized Large Language Model Alignment via Post-hoc Parameter Merging · ☆ 98 · Updated last year
- ICML 2024 - Official Repository for EXO: Towards Efficient Exact Optimization of Language Model Alignment · ☆ 46 · Updated 5 months ago
- Official implementation of the Reward rAnked Fine-Tuning (RAFT) algorithm, also known as iterative best-of-n fine-tuning or re… · ☆ 16 · Updated last month
- ☆ 44 · Updated 10 months ago
- [ACL'24, Outstanding Paper] Emulated Disalignment: Safety Alignment for Large Language Models May Backfire! · ☆ 29 · Updated 3 months ago
- ☆ 23 · Updated 6 months ago
- [EMNLP Findings 2024 & ACL 2024 NLRSE Oral] Enhancing Mathematical Reasoning in Language Models with Fine-grained Rewards · ☆ 44 · Updated 6 months ago
- Official code for the paper "Large Language Models Are Implicitly Topic Models: Explaining and Finding Good Demonstrations for In-Context Learning" · ☆ 68 · Updated 8 months ago
- ☆ 36 · Updated 3 months ago
- Official repository for the paper "Safety Alignment Should Be Made More Than Just a Few Tokens Deep" · ☆ 28 · Updated 4 months ago
- Code for the paper "Toward Optimal LLM Alignments Using Two-Player Games" · ☆ 14 · Updated 5 months ago