ekinakyurek / gpt3-arithmetic
Scratchpad/Chain-of-Thought Prompts
☆12 · Updated 2 years ago
Alternatives and similar repositories for gpt3-arithmetic
Users who are interested in gpt3-arithmetic are comparing it to the repositories listed below.
- NaturalProver: Grounded Mathematical Proof Generation with Language Models ☆38 · Updated 2 years ago
- Code for the paper "Decomposing the Enigma: Subgoal-based Demonstration Learning for Formal Theorem Proving" ☆19 · Updated 2 years ago
- Official implementation of AAAI 2025 paper "Augmenting Math Word Problems via Iterative Question Composing" (https://arxiv.org/abs/2401.09… ☆20 · Updated 5 months ago
- ☆24 · Updated 8 months ago
- Code for the arXiv paper: "LLMs as Factual Reasoners: Insights from Existing Benchmarks and Beyond" ☆59 · Updated 4 months ago
- The instructions and demonstrations for building a formal logical reasoning capable GLM ☆53 · Updated 9 months ago
- ☆44 · Updated 9 months ago
- A unified benchmark for math reasoning ☆88 · Updated 2 years ago
- [EMNLP'23] Execution-Based Evaluation for Open Domain Code Generation ☆48 · Updated last year
- ☆24 · Updated 7 months ago
- The LM Contamination Index is a manually created database of contamination evidences for LMs. ☆78 · Updated last year
- [ICLR 2024] COLLIE: Systematic Construction of Constrained Text Generation Tasks ☆51 · Updated last year
- Supporting code for ReCEval paper ☆28 · Updated 8 months ago
- Scripts for downloading and pre-processing the `proof-pile`, a high quality dataset of mathematical text and code. ☆19 · Updated 2 years ago
- The corresponding code from our paper "REFINER: Reasoning Feedback on Intermediate Representations" (EACL 2024). Do not hesitate t… ☆70 · Updated last year
- ☆39 · Updated 11 months ago
- Repo for the paper "Large Language Models Struggle to Learn Long-Tail Knowledge" ☆77 · Updated 2 years ago
- ☆21 · Updated 3 years ago
- [EMNLP'24 (Main)] DRPO (Dynamic Rewarding with Prompt Optimization) is a tuning-free approach for self-alignment. DRPO leverages a search-… ☆24 · Updated 6 months ago
- Evaluate the Quality of Critique ☆35 · Updated last year
- A framework for few-shot evaluation of autoregressive language models. ☆24 · Updated last year
- ☆75 · Updated 2 months ago
- CodeUltraFeedback: aligning large language models to coding preferences ☆71 · Updated 11 months ago
- Neuro-Symbolic Integration Brings Causal and Reliable Reasoning Proofs ☆36 · Updated last year
- The official repo for "TheoremQA: A Theorem-driven Question Answering dataset" (EMNLP 2023) ☆32 · Updated last year
- The official code of EMNLP 2022, "SCROLLS: Standardized CompaRison Over Long Language Sequences". ☆69 · Updated last year
- Evaluation on Logical Reasoning and Abstract Reasoning Challenges ☆27 · Updated last month
- ☆27 · Updated last year
- SatLM: SATisfiability-Aided Language Models using Declarative Prompting (NeurIPS 2023) ☆48 · Updated 10 months ago
- Repo for ICML23 "Why do Nearest Neighbor Language Models Work?" ☆57 · Updated 2 years ago