allenai / numglue
NumGLUE: A Suite of Fundamental yet Challenging Mathematical Reasoning Tasks
☆20 Updated 3 years ago
Alternatives and similar repositories for numglue
Users that are interested in numglue are comparing it to the libraries listed below
- ☆45 Updated last year
- Companion repo for "Evaluating Verifiability in Generative Search Engines". ☆83 Updated 2 years ago
- This repository accompanies our paper “Do Prompt-Based Models Really Understand the Meaning of Their Prompts?” ☆85 Updated 3 years ago
- EMNLP 2022: Generating Natural Language Proofs with Verifier-Guided Search https://arxiv.org/abs/2205.12443 ☆86 Updated 10 months ago
- The official code of TACL 2021, "Did Aristotle Use a Laptop? A Question Answering Benchmark with Implicit Reasoning Strategies". ☆75 Updated 2 years ago
- Code for the paper "CrossFit: A Few-shot Learning Challenge for Cross-task Generalization in NLP" (https://arxiv.org/abs/2104.08835) ☆111 Updated 3 years ago
- A unified benchmark for math reasoning ☆88 Updated 2 years ago
- ☆82 Updated 2 years ago
- PyTorch code for the RetoMaton paper: "Neuro-Symbolic Language Modeling with Automaton-augmented Retrieval" (ICML 2022) ☆73 Updated 3 years ago
- Distributional Generalization in NLP. A roadmap. ☆88 Updated 2 years ago
- ☆36 Updated last year
- [EMNLP 2021] Dataset and PyTorch Code for ExplaGraphs: An Explanation Graph Generation Task for Structured Commonsense Reasoning ☆12 Updated 2 years ago
- The LM Contamination Index is a manually created database of contamination evidences for LMs. ☆78 Updated last year
- A unified approach to explain conditional text generation models. PyTorch. The code of paper "Local Explanation of Dialogue Response Gene… ☆17 Updated 3 years ago
- ☆58 Updated 3 years ago
- Data and Code Release for "On the Potential of Lexico-logical Alignments for Semantic Parsing to SQL Queries" ☆53 Updated 4 years ago
- Implementation of the paper: "Turning Tables: Generating Examples from Semi-structured Tables for Endowing Language Models with Reasoning… ☆22 Updated 3 years ago
- Code for the paper "Simulating Bandit Learning from User Feedback for Extractive Question Answering". ☆18 Updated 2 years ago
- ☆35 Updated 3 years ago
- WikiWhy is a new benchmark for evaluating LLMs' ability to explain between cause-effect relationships. It is a QA dataset containing 9000… ☆47 Updated last year
- Repo for the paper "Large Language Models Struggle to Learn Long-Tail Knowledge" ☆77 Updated 2 years ago
- Code for ACL2021 long paper: Knowledgeable or Educated Guess? Revisiting Language Models as Knowledge Bases ☆29 Updated 3 years ago
- Code and data for paper "Context-faithful Prompting for Large Language Models". ☆40 Updated 2 years ago
- Code Repo for the ACL21 paper "Common Sense Beyond English: Evaluating and Improving Multilingual LMs for Commonsense Reasoning" ☆22 Updated 3 years ago
- Findings of ACL'2023: Optimizing Test-Time Query Representations for Dense Retrieval ☆30 Updated last year
- This repository contains the code for "How many data points is a prompt worth?" ☆48 Updated 4 years ago
- ☆87 Updated 2 years ago
- Code Repo for "Differentiable Open-Ended Commonsense Reasoning" (NAACL 2021) ☆32 Updated 2 years ago
- Code and Data for NeurIPS2021 Paper "A Dataset for Answering Time-Sensitive Questions" ☆72 Updated 3 years ago
- TBC ☆27 Updated 2 years ago