facebookresearch / PerSE
Personalized Story Evaluation Model
☆11 Updated last year
Alternatives and similar repositories for PerSE
Users who are interested in PerSE are comparing it to the repositories listed below.
- This code accompanies the paper DisentQA: Disentangling Parametric and Contextual Knowledge with Counterfactual Question Answering. ☆17 Updated 2 years ago
- [ACL 2024] Code for "MoPS: Modular Story Premise Synthesis for Open-Ended Automatic Story Generation" ☆36 Updated 9 months ago
- Official Code for EMNLP2023 Main Conference paper: "KCTS: Knowledge-Constrained Tree Search Decoding with Token-Level Hallucination Detec… ☆30 Updated last year
- Code for ACL 2024 paper "Soft Self-Consistency Improves Language Model Agents" ☆19 Updated 8 months ago
- [ACL 2024] LangBridge: Multilingual Reasoning Without Multilingual Supervision ☆87 Updated 6 months ago
- ☆34 Updated 2 years ago
- Semi-Parametric Editing with a Retrieval-Augmented Counterfactual Model ☆68 Updated 2 years ago
- [ICML 2023] Code for our paper “Compositional Exemplars for In-context Learning”. ☆100 Updated 2 years ago
- ☆32 Updated last month
- [ICLR'24 Spotlight] "Adaptive Chameleon or Stubborn Sloth: Revealing the Behavior of Large Language Models in Knowledge Conflicts" ☆68 Updated last year
- Official repository for paper "Weak-to-Strong Extrapolation Expedites Alignment" ☆74 Updated 11 months ago
- ☆14 Updated last year
- [NAACL'25 Oral] Steering Knowledge Selection Behaviours in LLMs via SAE-Based Representation Engineering ☆57 Updated 5 months ago
- ☆21 Updated 2 years ago
- [NeurIPS 2024] Train LLMs with diverse system messages reflecting individualized preferences to generalize to unseen system messages ☆45 Updated 5 months ago
- AbstainQA, ACL 2024 ☆25 Updated 7 months ago
- This repository contains the dataset and code for "WiCE: Real-World Entailment for Claims in Wikipedia" in EMNLP 2023. ☆41 Updated last year
- Suri: Multi-constraint instruction following for long-form text generation (EMNLP’24) ☆22 Updated 6 months ago
- BLEnD: A Benchmark for LLMs on Everyday Knowledge in Diverse Cultures and Languages ☆31 Updated last week
- ☆75 Updated last year
- The git repository of Modular Prompted Chatbot paper ☆33 Updated last year
- ☆9 Updated last year
- HANNA, a large annotated dataset of Human-ANnotated NArratives for ASG evaluation. ☆33 Updated 6 months ago
- GSM-Plus: Data, Code, and Evaluation for Enhancing Robust Mathematical Reasoning in Math Word Problems. ☆61 Updated 10 months ago
- ☆13 Updated 2 years ago
- [ICLR'25] Data and code for our paper "Why Does the Effective Context Length of LLMs Fall Short?" ☆75 Updated 5 months ago
- Companion code for FanOutQA: Multi-Hop, Multi-Document Question Answering for Large Language Models (ACL 2024) ☆53 Updated this week
- InstructIR, a novel benchmark specifically designed to evaluate the instruction-following ability in information retrieval models. Our foc… ☆32 Updated 11 months ago
- ☆15 Updated 2 years ago
- ☆28 Updated last year