activatedgeek / calibration-tuning
☆30Updated last month
Related projects:
- ☆22Updated 2 months ago
- Reference implementation for Reward-Augmented Decoding: Efficient Controlled Text Generation With a Unidirectional Reward Model☆40Updated 8 months ago
- [ACL 2024] Self-Training with Direct Preference Optimization Improves Chain-of-Thought Reasoning☆24Updated last month
- Self-Explore to avoid the pit! Improving the Reasoning Capabilities of Language Models with Fine-grained Rewards☆39Updated 4 months ago
- Tasks for describing differences between text distributions.☆15Updated last month
- Benchmarking Benchmark Leakage in Large Language Models☆39Updated 4 months ago
- ☆24Updated 6 months ago
- [ACL'24 Oral] Analysing The Impact of Sequence Composition on Language Model Pre-Training☆16Updated last month
- 📈 Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies https://arxiv.org/abs/2407.13623☆52Updated 3 weeks ago
- GSM-Plus: Data, Code, and Evaluation for Enhancing Robust Mathematical Reasoning in Math Word Problems.☆38Updated 2 months ago
- Repository for NPHardEval, a quantified-dynamic benchmark of LLMs☆46Updated 5 months ago
- Code and data used in the paper: "Training on Incorrect Synthetic Data via RL Scales LLM Math Reasoning Eight-Fold"☆22Updated 3 months ago
- Directional Preference Alignment☆44Updated 3 months ago
- ☆14Updated 6 months ago
- Prompting Large Language Models to Generate Dense and Sparse Representations for Zero-Shot Document Retrieval☆33Updated 3 months ago
- Aligning with Human Judgement: The Role of Pairwise Preference in Large Language Model Evaluators (Liu et al.; arXiv preprint arXiv:2403.…☆34Updated 2 months ago
- ☆25Updated 3 months ago
- Official implementation of Bootstrapping Language Models via DPO Implicit Rewards☆33Updated last month
- CausalGym: Benchmarking causal interpretability methods on linguistic tasks☆28Updated 6 months ago
- FollowIR: Evaluating and Teaching Information Retrieval Models to Follow Instructions☆37Updated 2 months ago
- Few-shot Learning with Auxiliary Data☆26Updated 9 months ago
- This repository contains data, code and models for contextual noncompliance.☆17Updated 2 months ago
- Restore safety in fine-tuned language models through task arithmetic☆25Updated 5 months ago
- Official repository for "Scaling Retrieval-Based Language Models with a Trillion-Token Datastore".☆81Updated 2 weeks ago
- ☆24Updated 4 months ago
- [ACL 2024 NLP4ConvAI Oral] Train LLMs with diverse system messages reflecting individualized preferences to generalize to unseen system m…☆33Updated 3 months ago
- Teaching Models to Express Their Uncertainty in Words☆36Updated 2 years ago
- ☆29Updated 10 months ago
- ☆44Updated 11 months ago
- ☆44Updated 2 weeks ago