TianduoWang / MsAT
[ACL 2023] Learning Multi-step Reasoning by Solving Arithmetic Tasks. https://arxiv.org/abs/2306.01707
☆24Updated last year
Alternatives and similar repositories for MsAT:
Users who are interested in MsAT are comparing it to the repositories listed below
- This code accompanies the paper DisentQA: Disentangling Parametric and Contextual Knowledge with Counterfactual Question Answering.☆17Updated last year
- Official codebase for “In-Context Learning with Many Demonstration Examples”☆16Updated 2 years ago
- ☆75Updated last year
- Easy-to-use framework for evaluating cross-lingual consistency of factual knowledge (supports LLaMA, BLOOM, mT5, RoBERTa, etc.) Paper he…☆22Updated this week
- GitHub repository for "FELM: Benchmarking Factuality Evaluation of Large Language Models" (NeurIPS 2023)☆58Updated last year
- ☆33Updated 2 years ago
- Official implementation of the ACL 2023 paper: "Zero-shot Faithful Factual Error Correction"☆17Updated last year
- ☆14Updated 2 years ago
- ☆43Updated last year
- ☆28Updated last year
- ☆82Updated last year
- [EMNLP 2023] MQuAKE: Assessing Knowledge Editing in Language Models via Multi-Hop Questions☆105Updated 6 months ago
- LongProc: Benchmarking Long-Context Language Models on Long Procedural Generation☆20Updated last week
- WikiWhy is a new benchmark for evaluating LLMs' ability to explain cause-effect relationships. It is a QA dataset containing 9000…☆47Updated last year
- ☆66Updated 2 months ago
- ☆15Updated 9 months ago
- GSM-Plus: Data, Code, and Evaluation for Enhancing Robust Mathematical Reasoning in Math Word Problems.☆54Updated 8 months ago
- ☆85Updated 2 years ago
- Official Code for EMNLP2023 Main Conference paper: "KCTS: Knowledge-Constrained Tree Search Decoding with Token-Level Hallucination Detec…☆30Updated last year
- ☆86Updated last year
- ☆15Updated 4 months ago
- ☆30Updated 10 months ago
- ☆72Updated 9 months ago
- ☆17Updated last year
- Code for the paper InstructDial: Improving Zero and Few-shot Generalization in Dialogue through Instruction Tuning☆99Updated last year
- Analyzing LLM Alignment via Token distribution shift☆15Updated last year
- ☆52Updated 6 months ago
- Revisiting Cross-Lingual Summarization: A Corpus-based Study and A New Benchmark with Improved Annotation☆18Updated 11 months ago
- ☆61Updated 2 years ago
- ☆16Updated last year