zankner / CLoud
Critique-out-Loud Reward Models
☆17Updated 2 weeks ago
Related projects:
- Repository for Skill Set Optimization☆12Updated last month
- Reference implementation for Reward-Augmented Decoding: Efficient Controlled Text Generation With a Unidirectional Reward Model☆40Updated 8 months ago
- InstructIR, a novel benchmark specifically designed to evaluate the instruction-following ability of information retrieval models. Our foc…☆28Updated 3 months ago
- ☆24Updated 6 months ago
- [ACL 2024 NLP4ConvAI Oral] Train LLMs with diverse system messages reflecting individualized preferences to generalize to unseen system m…☆33Updated 3 months ago
- This is the official repository for "Safer-Instruct: Aligning Language Models with Automated Preference Data"☆14Updated 6 months ago
- Benchmarking Benchmark Leakage in Large Language Models☆39Updated 4 months ago
- Scalable Meta-Evaluation of LLMs as Evaluators☆39Updated 7 months ago
- SWIM-IR is a Synthetic Wikipedia-based Multilingual Information Retrieval training set with 28 million query-passage pairs spanning 33 la…☆42Updated 10 months ago
- 👻 Code and benchmark for our EMNLP 2023 paper - "FANToM: A Benchmark for Stress-testing Machine Theory of Mind in Interactions"☆49Updated 3 months ago
- [ACL 2024] Self-Training with Direct Preference Optimization Improves Chain-of-Thought Reasoning☆24Updated last month
- Large language models (LLMs) made easy: EasyLM is a one-stop solution for pre-training, finetuning, evaluating and serving LLMs in JAX/Fl…☆52Updated last month
- Code and Data for the NAACL 24 paper: MacGyver: Are Large Language Models Creative Problem Solvers?☆21Updated 5 months ago
- This repository contains some of the code used in the paper "Training Language Models with Language Feedback at Scale"☆26Updated last year
- Minimal implementation of the paper "Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models" (arXiv:2401.01335)☆25Updated 6 months ago
- ☆15Updated last month
- Are LLMs Capable of Data-based Statistical and Causal Reasoning? Benchmarking Advanced Quantitative Reasoning with Data☆28Updated last month
- Prompting Large Language Models to Generate Dense and Sparse Representations for Zero-Shot Document Retrieval☆33Updated 3 months ago
- ☆19Updated 11 months ago
- [ICLR'24 spotlight] Tool-Augmented Reward Modeling☆33Updated 6 months ago
- [NAACL 2024 Findings] Evaluation suite for the systematic evaluation of instruction selection methods.☆23Updated last year
- ☆27Updated 5 months ago
- Plug-and-play Search Interfaces with Pyserini and Hugging Face☆32Updated last year
- BRIGHT: A Realistic and Challenging Benchmark for Reasoning-Intensive Retrieval☆41Updated last month
- Code for the arXiv preprint "The Unreasonable Effectiveness of Easy Training Data"☆44Updated 8 months ago
- ☆16Updated 10 months ago
- Byte-sized text games for code generation tasks on virtual environments☆17Updated 2 months ago
- Code repo for "Model-Generated Pretraining Signals Improves Zero-Shot Generalization of Text-to-Text Transformers" (ACL 2023)☆22Updated 10 months ago
- ☆30Updated last month
- Directional Preference Alignment☆44Updated 3 months ago