zhangir-azerbayev / proof-pile
Scripts for downloading and pre-processing the `proof-pile`, a high-quality dataset of mathematical text and code.
☆19 · Updated 2 years ago
Alternatives and similar repositories for proof-pile:
Users who are interested in proof-pile are comparing it to the repositories listed below:
- NaturalProver: Grounded Mathematical Proof Generation with Language Models ☆37 · Updated 2 years ago
- ☆25 · Updated 8 months ago
- Code for the paper "Decomposing the Enigma: Subgoal-based Demonstration Learning for Formal Theorem Proving" ☆19 · Updated last year
- Official implementation of AAAI 2025 paper "Augmenting Math Word Problems via Iterative Question Composing" (https://arxiv.org/abs/2401.09… ☆20 · Updated 4 months ago
- A framework for few-shot evaluation of autoregressive language models. ☆24 · Updated last year
- Implementation of the model: "Reka Core, Flash, and Edge: A Series of Powerful Multimodal Language Models" in PyTorch ☆30 · Updated 2 weeks ago
- Repo for ICML23 "Why do Nearest Neighbor Language Models Work?" ☆56 · Updated 2 years ago
- Evaluate the Quality of Critique ☆34 · Updated 11 months ago
- Scratchpad/Chain-of-Thought Prompts ☆12 · Updated 2 years ago
- ☆24 · Updated 6 months ago
- ☆33 · Updated last year
- Conic10K: A large-scale dataset for closed-vocabulary math problem understanding. Accepted to EMNLP2023 Findings. ☆25 · Updated last year
- Official repository of paper "RNNs Are Not Transformers (Yet): The Key Bottleneck on In-context Retrieval" ☆27 · Updated last year
- Revisiting Mid-training in the Era of RL Scaling ☆36 · Updated 2 weeks ago
- Code for "Natural Language to Code Translation with Execution" ☆41 · Updated 2 years ago
- Code for the paper LeanReasoner: Boosting Complex Logical Reasoning with Lean: https://arxiv.org/pdf/2403.13312.pdf ☆22 · Updated 11 months ago
- Tasks for describing differences between text distributions. ☆16 · Updated 9 months ago
- Neural theorem proving tutorial, version II ☆36 · Updated last year
- Pile Deduplication Code ☆19 · Updated last year
- ☆17 · Updated 11 months ago
- The official repository for the paper Multilingual Mathematical Autoformalization ☆35 · Updated 11 months ago
- ☆28 · Updated last year
- ☆75 · Updated last month
- ☆95 · Updated last year
- [ACL'24 Oral] Analysing The Impact of Sequence Composition on Language Model Pre-Training ☆21 · Updated 8 months ago
- Repository for Skill Set Optimization ☆12 · Updated 9 months ago
- ☆14 · Updated last year
- 🤖ConvRe🤯: An Investigation of LLMs’ Inefficacy in Understanding Converse Relations (EMNLP 2023)