princeton-nlp / NLProofS
EMNLP 2022: Generating Natural Language Proofs with Verifier-Guided Search https://arxiv.org/abs/2205.12443
☆83 Updated 6 months ago
Alternatives and similar repositories for NLProofS:
Users who are interested in NLProofS are comparing it to the libraries listed below:
- ☆58 Updated 2 years ago
- ☆44 Updated last year
- ☆82 Updated last year
- Companion repo for "Evaluating Verifiability in Generative Search Engines". ☆83 Updated last year
- Code Repo for the ACL21 paper "Common Sense Beyond English: Evaluating and Improving Multilingual LMs for Commonsense Reasoning" ☆22 Updated 3 years ago
- ☆48 Updated 2 years ago
- EMNLP 2022: Finding Dataset Shortcuts with Grammar Induction https://arxiv.org/abs/2210.11560 ☆58 Updated 3 weeks ago
- This repository accompanies our paper “Do Prompt-Based Models Really Understand the Meaning of Their Prompts?” ☆85 Updated 2 years ago
- A unified benchmark for math reasoning ☆87 Updated 2 years ago
- ☆38 Updated last year
- ☆36 Updated 11 months ago
- Few-shot NLP benchmark for unified, rigorous eval ☆91 Updated 2 years ago
- First explanation metric (diagnostic report) for text generation evaluation ☆62 Updated 2 weeks ago
- This project maintains a reading list for general text generation tasks ☆65 Updated 3 years ago
- Code and data for "Retrieval Enhanced Model for Commonsense Generation" (ACL-IJCNLP 2021). ☆28 Updated 3 years ago
- Code for ACL2021 long paper: Knowledgeable or Educated Guess? Revisiting Language Models as Knowledge Bases ☆29 Updated 3 years ago
- Code and Data for NeurIPS2021 Paper "A Dataset for Answering Time-Sensitive Questions" ☆66 Updated 3 years ago
- Authors' implementation of the paper Adaptive Information Seeking for Open-Domain Question Answering, published in EMNLP 2021. ☆37 Updated last year
- [ACL 2022] Ditch the Gold Standard: Re-evaluating Conversational Question Answering ☆45 Updated 2 years ago
- [EACL'23] MCoNaLa: A Benchmark for Code Generation from Multiple Natural Languages ☆23 Updated 2 years ago
- FRANK: Factuality Evaluation Benchmark ☆54 Updated 2 years ago
- The official repository for the paper "From Zero to Hero: Examining the Power of Symbolic Tasks in Instruction Tuning". ☆64 Updated last year
- EMNLP 2021 - CTC: A Unified Framework for Evaluating Natural Language Generation ☆96 Updated 2 years ago
- This repository contains the code for "How many data points is a prompt worth?" ☆48 Updated 3 years ago
- [NAACL 2021] Factual Probing Is [MASK]: Learning vs. Learning to Recall https://arxiv.org/abs/2104.05240 ☆167 Updated 2 years ago
- EMNLP 2021: Single-dataset Experts for Multi-dataset Question-Answering ☆70 Updated 3 years ago
- ☆92 Updated 2 years ago
- PyTorch code for the RetoMaton paper: "Neuro-Symbolic Language Modeling with Automaton-augmented Retrieval" (ICML 2022) ☆71 Updated 2 years ago
- NumGLUE: A Suite of Fundamental yet Challenging Mathematical Reasoning Tasks ☆20 Updated 2 years ago
- Code and data for paper "Context-faithful Prompting for Large Language Models". ☆39 Updated last year