cambridgeltl / PairS
Aligning with Human Judgement: The Role of Pairwise Preference in Large Language Model Evaluators (Liu et al.; COLM 2024)
☆47Jan 21, 2025Updated last year
Alternatives and similar repositories for PairS
Users who are interested in PairS are comparing it to the libraries listed below.
- Fairer Preferences Elicit Improved Human-Aligned Large Language Model Judgments (Zhou et al., EMNLP 2024)☆14Oct 3, 2024Updated last year
- HANNA, a large annotated dataset of Human-ANnotated NArratives for ASG evaluation.☆35Oct 15, 2024Updated last year
- Learning to route instances for Human vs AI Feedback (ACL Main '25)☆26Jul 23, 2025Updated 6 months ago
- ☆10Oct 22, 2024Updated last year
- The official repository of "Improving Large Language Models via Fine-grained Reinforcement Learning with Minimum Editing Constraint"☆39Jan 12, 2024Updated 2 years ago
- [NeurIPS 2023 D&B Track] Code and data for paper "Revisiting Out-of-distribution Robustness in NLP: Benchmarks, Analysis, and LLMs Evalua…☆36Jun 8, 2023Updated 2 years ago
- Code for "Bridging the Gap between f-GANs and Wasserstein GANs", ICML 2020☆14Jul 18, 2020Updated 5 years ago
- Creating Generative AI Apps which work☆17Apr 14, 2025Updated 10 months ago
- ☆18Aug 19, 2024Updated last year
- Code & data for EMNLP 2020 paper "MOCHA: A Dataset for Training and Evaluating Reading Comprehension Metrics".☆16May 3, 2022Updated 3 years ago
- [NAACL 2025] Representing Rule-based Chatbots with Transformers☆23Feb 9, 2025Updated last year
- ☆16Jul 23, 2024Updated last year
- Repo for "On Learning to Summarize with Large Language Models as References"☆43May 24, 2023Updated 2 years ago
- ☆20Nov 3, 2024Updated last year
- ☆20Jan 8, 2026Updated last month
- [NeurIPS 2024 D&B] Evaluating Copyright Takedown Methods for Language Models☆17Jul 17, 2024Updated last year
- AutoHallusion Codebase (EMNLP 2024)☆22Dec 6, 2024Updated last year
- The official implementation of the paper "Reducing Fine-Tuning Memory Overhead by Approximate and Memory-Sharing Backpropagation"☆21Dec 10, 2024Updated last year
- This is the repo for our paper "Mr-Ben: A Comprehensive Meta-Reasoning Benchmark for Large Language Models"☆51Oct 31, 2024Updated last year
- PyTorch code for System-1.x: Learning to Balance Fast and Slow Planning with Language Models☆24Jul 22, 2024Updated last year
- Customize, control, and enhance LLM generation with logits processors, featuring visualization capabilities to inspect and understand sta…☆44Jan 8, 2026Updated last month
- [ACL 2024] ANAH & [NeurIPS 2024] ANAH-v2 & [ICLR 2025] Mask-DPO☆62Apr 30, 2025Updated 9 months ago
- Code for ACL 2020 paper: USR: An Unsupervised and Reference Free Evaluation Metric for Dialog Generation (https://arxiv.org/pdf/2005.0045…☆49Dec 8, 2022Updated 3 years ago
- Code for EMNLP 2024 paper "Learn Beyond The Answer: Training Language Models with Reflection for Mathematical Reasoning"☆54Oct 1, 2024Updated last year
- ☆22Oct 21, 2024Updated last year
- [NeurIPS 2024] Code and Data Repo for Paper "Embedding Trajectory for Out-of-Distribution Detection in Mathematical Reasoning"☆28May 28, 2024Updated last year
- Delta-CoMe can achieve near-lossless 1-bit compression; accepted by NeurIPS 2024☆58Nov 16, 2024Updated last year
- Understanding Factual Errors in Summarization: Errors, Summarizers, Datasets, Error Detectors (ACL 2023)☆28Mar 26, 2024Updated last year
- An official implementation of "Catastrophic Failure of LLM Unlearning via Quantization" (ICLR 2025)☆37Feb 22, 2025Updated 11 months ago
- DiWA: Diverse Weight Averaging for Out-of-Distribution Generalization☆31Jan 31, 2023Updated 3 years ago
- ☆63Mar 1, 2025Updated 11 months ago
- A framework for pitting LLMs against each other in an evolving library of games ⚔☆35Apr 17, 2025Updated 10 months ago
- Few-shot Learning with Auxiliary Data☆31Dec 8, 2023Updated 2 years ago
- Is In-Context Learning Sufficient for Instruction Following in LLMs? [ICLR 2025]☆32Jan 23, 2025Updated last year
- Restore safety in fine-tuned language models through task arithmetic☆32Mar 28, 2024Updated last year
- Create a source of truth for ML model results and browse it on Papers with Code☆34Jun 9, 2021Updated 4 years ago
- Gemstones: A Model Suite for Multi-Faceted Scaling Laws (NeurIPS 2025)☆33Sep 28, 2025Updated 4 months ago
- Codebase for Instruction Following without Instruction Tuning☆36Sep 24, 2024Updated last year
- Improved Few-Shot Jailbreaking Can Circumvent Aligned Language Models and Their Defenses (NeurIPS 2024)☆65Jan 11, 2025Updated last year