princeton-nlp / NLProofS
EMNLP 2022: Generating Natural Language Proofs with Verifier-Guided Search https://arxiv.org/abs/2205.12443
☆86 Updated 9 months ago
Alternatives and similar repositories for NLProofS
Users who are interested in NLProofS are comparing it to the libraries listed below.
- ☆44 Updated last year
- A unified benchmark for math reasoning ☆88 Updated 2 years ago
- ☆82 Updated 2 years ago
- PyTorch code for the RetoMaton paper: "Neuro-Symbolic Language Modeling with Automaton-augmented Retrieval" (ICML 2022) ☆73 Updated 2 years ago
- This repository accompanies our paper “Do Prompt-Based Models Really Understand the Meaning of Their Prompts?” ☆85 Updated 3 years ago
- EMNLP 2022: Finding Dataset Shortcuts with Grammar Induction https://arxiv.org/abs/2210.11560 ☆58 Updated 3 months ago
- ☆58 Updated 3 years ago
- NumGLUE: A Suite of Fundamental yet Challenging Mathematical Reasoning Tasks ☆20 Updated 3 years ago
- The LM Contamination Index is a manually created database of contamination evidences for LMs. ☆78 Updated last year
- Companion repo for "Evaluating Verifiability in Generative Search Engines". ☆83 Updated 2 years ago
- ☆48 Updated last year
- This repository contains the code for "How many data points is a prompt worth?" ☆48 Updated 4 years ago
- Code and data for "Retrieval Enhanced Model for Commonsense Generation" (ACL-IJCNLP 2021). ☆28 Updated 3 years ago
- Code and Models for the paper "End-to-End Training of Multi-Document Reader and Retriever for Open-Domain Question Answering" (NeurIPS 20… ☆109 Updated 3 years ago
- ☆39 Updated 2 years ago
- The code of Paper "Logic-Driven Context Extension and Data Augmentation for Logical Reasoning of Text". ☆44 Updated 2 years ago
- Technical Report: Is ChatGPT a Good NLG Evaluator? A Preliminary Study ☆43 Updated 2 years ago
- Pytorch implementation of “Recursive Non-Autoregressive Graph-to-Graph Transformer for Dependency Parsing with Iterative Refinement” ☆62 Updated 4 years ago
- WikiWhy is a new benchmark for evaluating LLMs' ability to explain between cause-effect relationships. It is a QA dataset containing 9000… ☆47 Updated last year
- ☆33 Updated 2 years ago
- Code for Editing Factual Knowledge in Language Models ☆138 Updated 3 years ago
- First explanation metric (diagnostic report) for text generation evaluation ☆62 Updated 3 months ago
- The official repository for the paper "From Zero to Hero: Examining the Power of Symbolic Tasks in Instruction Tuning". ☆65 Updated 2 years ago
- This project maintains a reading list for general text generation tasks ☆65 Updated 3 years ago
- [ACL 2022] Ditch the Gold Standard: Re-evaluating Conversational Question Answering ☆45 Updated 3 years ago
- This repository contains the dataset and code for "WiCE: Real-World Entailment for Claims in Wikipedia" in EMNLP 2023. ☆41 Updated last year
- Dataset, metrics, and models for TACL 2023 paper MACSUM: Controllable Summarization with Mixed Attributes. ☆34 Updated last year
- ☆75 Updated last year
- ☆36 Updated last year
- DEMix Layers for Modular Language Modeling ☆53 Updated 3 years ago