collinskatie / checkmate
☆45 · Updated 6 months ago
Alternatives and similar repositories for checkmate
Users who are interested in checkmate are comparing it to the libraries listed below:
- Harmonic Datasets ☆37 · Updated 8 months ago
- LLMs + Lean, on your laptop or in the cloud ☆139 · Updated 5 months ago
- [ICML 2023] "Outline, Then Details: Syntactically Guided Coarse-To-Fine Code Generation", Wenqing Zheng, S P Sharan, Ajay Kumar Jaiswal, … ☆40 · Updated last year
- Evaluation of neuro-symbolic engines ☆35 · Updated 7 months ago
- This is the official repository for all the code of TheoremLlama ☆39 · Updated 5 months ago
- The official repository for the paper Multilingual Mathematical Autoformalization ☆34 · Updated 10 months ago
- Repository for the paper Stream of Search: Learning to Search in Language ☆142 · Updated last month
- NaturalProver: Grounded Mathematical Proof Generation with Language Models ☆36 · Updated 2 years ago
- LILO: Library Induction with Language Observations ☆85 · Updated 7 months ago
- Q-Probe: A Lightweight Approach to Reward Maximization for Language Models ☆41 · Updated 9 months ago
- Flow of Reasoning: Training LLMs for Divergent Problem Solving with Minimal Examples ☆80 · Updated this week
- Neural theorem proving tutorial, version II ☆34 · Updated 11 months ago
- LeanUniverse: A Library for Consistent and Scalable Lean4 Dataset Management ☆59 · Updated 2 months ago
- Code for RATIONALYST: Pre-training Process-Supervision for Improving Reasoning https://arxiv.org/pdf/2410.01044 ☆32 · Updated 5 months ago
- Official code for paper: INT: An Inequality Benchmark for Evaluating Generalization in Theorem Proving ☆39 · Updated 2 years ago
- Understanding the correlation between different LLM benchmarks ☆29 · Updated last year
- Code for the paper LeanReasoner: Boosting Complex Logical Reasoning with Lean: https://arxiv.org/pdf/2403.13312.pdf ☆22 · Updated 10 months ago
- Implementation of OpenAI's 'Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets' paper. ☆35 · Updated last year
- Benchmark for undergraduate-level formal mathematics ☆104 · Updated 5 months ago
- Minimum Description Length probing for neural network representations ☆19 · Updated 2 months ago
- Functional Benchmarks and the Reasoning Gap ☆84 · Updated 5 months ago
- Advanced Reasoning Benchmark Dataset for LLMs ☆45 · Updated last year
- gzip Predicts Data-dependent Scaling Laws ☆34 · Updated 10 months ago
- An updated version of miniF2F with lots of fixes and informal statements / solutions. ☆78 · Updated 2 months ago
- Can Language Models Solve Olympiad Programming? ☆112 · Updated 2 months ago