BunsenFeng / FactKB
Code for "FactKB: Generalizable Factuality Evaluation using Language Models Enhanced with Factual Knowledge". EMNLP 2023.
☆18Updated last year
Alternatives and similar repositories for FactKB:
Users who are interested in FactKB are comparing it to the libraries listed below.
- AbstainQA, ACL 2024☆25Updated 3 months ago
- [EMNLP 2023] Knowledge Rumination for Pre-trained Language Models☆17Updated last year
- Generating diverse counterfactual data for Natural Language Understanding tasks using Large Language Models (LLMs). The generator support…☆36Updated last year
- Code and data for paper "Context-faithful Prompting for Large Language Models".☆39Updated last year
- [EMNLP-2022 Findings] Code for paper “ProGen: Progressive Zero-shot Dataset Generation via In-context Feedback”.☆25Updated last year
- Data and code for ACL 2022 paper "MultiHiertt: Numerical Reasoning over Multi Hierarchical Tabular and Textual Data"☆42Updated 2 months ago
- Resources for Retrieval Augmentation for Commonsense Reasoning: A Unified Approach. EMNLP 2022.☆21Updated 2 years ago
- The official repository of "Improving Large Language Models via Fine-grained Reinforcement Learning with Minimum Editing Constraint"☆33Updated last year
- Code and Data for NeurIPS2021 Paper "A Dataset for Answering Time-Sensitive Questions"☆66Updated 2 years ago
- Evaluate the Quality of Critique☆35Updated 7 months ago
- ✨ Resolving Knowledge Conflicts in Large Language Models, COLM 2024☆15Updated 3 months ago
- Grade-School Math with Irrelevant Context (GSM-IC) benchmark is an arithmetic reasoning dataset built upon GSM8K, by adding irrelevant se…☆58Updated last year
- Benchmarking Generalization to New Tasks from Natural Language Instructions☆26Updated 3 years ago
- Evaluation on Logical Reasoning and Abstract Reasoning Challenges☆21Updated 11 months ago
- Adding new tasks to T0 without catastrophic forgetting☆32Updated 2 years ago
- Official implementation of our paper "Towards Reasoning in Large Language Models via Multi-Agent Peer Review Collaboration".☆12Updated 2 months ago
- [EMNLP'24 (Main)] DRPO(Dynamic Rewarding with Prompt Optimization) is a tuning-free approach for self-alignment. DRPO leverages a search-…☆18Updated 2 months ago
- [EMNLP 2022] Code for our paper “ZeroGen: Efficient Zero-shot Learning via Dataset Generation”.☆16Updated 2 years ago
- BRIGHT: A Realistic and Challenging Benchmark for Reasoning-Intensive Retrieval☆66Updated 2 weeks ago
- "FiD-ICL: A Fusion-in-Decoder Approach for Efficient In-Context Learning" (ACL 2023)☆13Updated last year
- ☆85Updated last year
- Paper list of "The Life Cycle of Knowledge in Big Language Models: A Survey"☆60Updated last year
- WikiWhy is a new benchmark for evaluating LLMs' ability to explain cause-effect relationships. It is a QA dataset containing 9000…☆47Updated last year
- Personality Alignment of Language Models☆19Updated 4 months ago
- Towards Systematic Measurement for Long Text Quality☆31Updated 4 months ago
- EMNLP 2022: "MABEL: Attenuating Gender Bias using Textual Entailment Data" https://arxiv.org/abs/2210.14975☆37Updated last year
- FollowIR: Evaluating and Teaching Information Retrieval Models to Follow Instructions☆40Updated 6 months ago
- ☆50Updated last year
- Github repository for "FELM: Benchmarking Factuality Evaluation of Large Language Models" (NeurIPS 2023)☆57Updated last year
- Code for "From Pretraining Data to Language Models to Downstream Tasks: Tracking the Trails of Political Biases Leading to Unfair NLP Mod…☆31Updated 10 months ago