friederrr / GHOSTS
GHOSTS dataset
☆38Updated last year
Alternatives and similar repositories for GHOSTS:
Users who are interested in GHOSTS are comparing it to the repositories listed below
- PyTorch code for the RetoMaton paper: "Neuro-Symbolic Language Modeling with Automaton-augmented Retrieval" (ICML 2022)☆71Updated 2 years ago
- The data and the PyTorch implementation for the models and experiments in the paper "Language Model Decoding as Likelihood–Utility Alignm…☆14Updated last year
- Code Repository for "A Causal Framework to Quantify the Robustness of Mathematical Reasoning with Language Models".☆13Updated 2 years ago
- Neural Unification for Logic Reasoning over Language☆22Updated 3 years ago
- ☆43Updated 2 years ago
- NumGLUE: A Suite of Fundamental yet Challenging Mathematical Reasoning Tasks☆20Updated 2 years ago
- Weakly Supervised Text-to-SQL Parsing through Question Decomposition☆22Updated last year
- EMNLP 2022: Generating Natural Language Proofs with Verifier-Guided Search https://arxiv.org/abs/2205.12443☆83Updated 5 months ago
- Official code for paper LIME: Learning Inductive Bias for Primitives of Mathematical Reasoning☆28Updated 3 years ago
- Code for MAWPS: A Math Word Problem Repository☆40Updated last year
- ☆105Updated 2 years ago
- ☆10Updated 4 years ago
- [EMNLP 2021] Dataset and PyTorch Code for ExplaGraphs: An Explanation Graph Generation Task for Structured Commonsense Reasoning☆11Updated 2 years ago
- Code for NAACL 2022 paper "Reframing Human-AI Collaboration for Generating Free-Text Explanations"☆31Updated last year
- Code for Stage-wise Fine-tuning for Graph-to-Text Generation☆26Updated 2 years ago
- [Work in progress] A reading list for machine commonsense reasoning☆33Updated 4 years ago
- MQAG: Multiple-choice Question Answering and Generation for Assessing Information Consistency☆29Updated last year
- ☆44Updated last year
- The LM Contamination Index is a manually created database of contamination evidence for LMs.☆77Updated 10 months ago
- A unified benchmark for math reasoning☆87Updated 2 years ago
- InstructIR, a novel benchmark specifically designed to evaluate the instruction-following ability of information retrieval models. Our foc…☆31Updated 8 months ago
- Benchmarking Generalization to New Tasks from Natural Language Instructions☆26Updated 3 years ago
- Baleen: Robust Multi-Hop Reasoning at Scale via Condensed Retrieval (NeurIPS'21)☆44Updated 3 years ago
- Companion repo for "Evaluating Verifiability in Generative Search Engines".☆83Updated last year
- Query-focused summarization data☆41Updated 2 years ago
- OpenPI dataset for tracking entities in open domain procedural text☆22Updated 6 months ago
- [ICLR 2023] PyTorch code of Summarization Programs: Interpretable Abstractive Summarization with Neural Modular Trees☆23Updated last year
- Supporting code for ReCEval paper☆28Updated 5 months ago
- The official implementation of "Evidentiality-guided Generation for Knowledge-Intensive NLP Tasks" (NAACL 2022).☆43Updated 2 years ago