collinskatie / checkmate
☆45 · Updated 5 months ago
Alternatives and similar repositories for checkmate:
Users who are interested in checkmate are comparing it to the libraries listed below.
- Harmonic Datasets ☆36 · Updated 7 months ago
- [ICML 2023] "Outline, Then Details: Syntactically Guided Coarse-To-Fine Code Generation", Wenqing Zheng, S P Sharan, Ajay Kumar Jaiswal, … ☆40 · Updated last year
- This is the official repository for all the code of TheoremLlama ☆38 · Updated 4 months ago
- ☆31 · Updated 5 months ago
- Code for the paper LeanReasoner: Boosting Complex Logical Reasoning with Lean: https://arxiv.org/pdf/2403.13312.pdf ☆20 · Updated 8 months ago
- Evaluation of neuro-symbolic engines ☆34 · Updated 6 months ago
- ☆28 · Updated last year
- LLMs + Lean, on your laptop or in the cloud ☆134 · Updated 3 months ago
- ☆13 · Updated 6 months ago
- ☆21 · Updated last year
- Llemma formal2formal (tactic prediction) theorem proving experiments ☆19 · Updated last year
- Code for RATIONALYST: Pre-training Process-Supervision for Improving Reasoning https://arxiv.org/pdf/2410.01044 ☆32 · Updated 4 months ago
- Can Language Models Solve Olympiad Programming? ☆110 · Updated last month
- ☆74 · Updated last year
- Repository for the paper Stream of Search: Learning to Search in Language ☆137 · Updated 2 weeks ago
- The official repository for the paper Multilingual Mathematical Autoformalization ☆33 · Updated 9 months ago
- Q-Probe: A Lightweight Approach to Reward Maximization for Language Models ☆40 · Updated 8 months ago
- ☆50 · Updated 4 months ago
- Get language models to generate responses in a specific format reliably. Open source implementation of Synchromesh: Reliable code generat… ☆27 · Updated 11 months ago
- Minimum Description Length probing for neural network representations ☆18 · Updated 3 weeks ago
- Repository for the code and dataset for the paper: "Have LLMs Advanced enough? Towards Harder Problem Solving Benchmarks For Large Langu… ☆39 · Updated last year
- ☆40 · Updated last week
- A testbed for agents and environments that can automatically improve models through data generation. ☆18 · Updated 2 months ago
- A benchmark that challenges language models to code solutions for scientific problems ☆108 · Updated this week
- LeanUniverse: A Library for Consistent and Scalable Lean4 Dataset Management ☆59 · Updated last month
- Understanding the correlation between different LLM benchmarks ☆29 · Updated last year
- LILO: Library Induction with Language Observations ☆83 · Updated 5 months ago
- ☆26 · Updated last year