ekinakyurek / gpt3-arithmetic
Scratchpad/Chain-of-Thought Prompts
☆12 Updated 2 years ago
Related projects:
- [EMNLP'23] Execution-Based Evaluation for Open Domain Code Generation ☆42 Updated 8 months ago
- code for "Natural Language to Code Translation with Execution" ☆39 Updated last year
- Scripts for downloading and pre-processing the `proof-pile`, a high quality dataset of mathematical text and code. ☆18 Updated last year
- Repository for Skill Set Optimization ☆12 Updated last month
- Evaluate the Quality of Critique ☆35 Updated 3 months ago
- GSM-Plus: Data, Code, and Evaluation for Enhancing Robust Mathematical Reasoning in Math Word Problems. ☆38 Updated 2 months ago
- NaturalProver: Grounded Mathematical Proof Generation with Language Models ☆34 Updated last year
- Code for the paper "Decomposing the Enigma: Subgoal-based Demonstration Learning for Formal Theorem Proving" ☆17 Updated last year
- Repo for ICML23 "Why do Nearest Neighbor Language Models Work?" ☆56 Updated last year
- Implementation of the model: "Reka Core, Flash, and Edge: A Series of Powerful Multimodal Language Models" in PyTorch ☆27 Updated this week
- Code Repository for "A Causal Framework to Quantify the Robustness of Mathematical Reasoning with Language Models". ☆12 Updated last year
- Source code and data for The Magic of IF: Investigating Causal Reasoning Abilities in Large Language Models of Code (Findings of ACL 2023… ☆28 Updated last year
- Exploring the Limitations of Large Language Models on Multi-Hop Queries ☆11 Updated 2 months ago
- [ICLR'24 spotlight] Tool-Augmented Reward Modeling ☆33 Updated 6 months ago
- PyTorch code for the RetoMaton paper: "Neuro-Symbolic Language Modeling with Automaton-augmented Retrieval" (ICML 2022) ☆69 Updated 2 years ago
- Code for RL4F: Generating Natural Language Feedback with Reinforcement Learning for Repairing Model Outputs. ACL 2023. ☆58 Updated 2 months ago
- Repo for "Zemi: Learning Zero-Shot Semi-Parametric Language Models from Multiple Tasks" ACL 2023 Findings ☆16 Updated last year
- A zero-shot neural semantic parser without using annotated parallel training data. ☆8 Updated 2 years ago
- The LM Contamination Index is a manually created database of contamination evidence for LMs. ☆73 Updated 5 months ago
- Official code repo for paper "Great Memory, Shallow Reasoning: Limits of kNN-LMs" ☆15 Updated 3 weeks ago
- Supporting code for ReCEval paper ☆26 Updated this week
- A framework for few-shot evaluation of autoregressive language models. ☆23 Updated 9 months ago
- A unified benchmark for math reasoning ☆87 Updated last year
- Adding new tasks to T0 without catastrophic forgetting ☆30 Updated last year