facebookresearch / polymath
AI Agent leveraging symbolic reasoning and other auxiliary tools to boost its capabilities on various logic and reasoning benchmarks. This project aims to develop a robust and flexible AI system that can tackle complex problems in areas such as decision-making, mathematics, and programming.
☆39 · Updated 5 months ago
Alternatives and similar repositories for polymath
Users who are interested in polymath are comparing it to the repositories listed below.
- LeanUniverse: A Library for Consistent and Scalable Lean4 Dataset Management ☆75 · Updated last year
- Large language models designed for formal theorem proving through tool-integrated reasoning. ☆31 · Updated 5 months ago
- ☆83 · Updated last year
- ☆408 · Updated last month
- ☆148 · Updated this week
- Mixture of Cognitive Reasoners: Modular Reasoning with Brain-Like Specialization ☆39 · Updated 2 months ago
- This is the official repository for all the code of TheoremLlama ☆47 · Updated 6 months ago
- ☆42 · Updated last year
- ☆191 · Updated 2 weeks ago
- BigOBench assesses the capacity of Large Language Models (LLMs) to comprehend time-space computational complexity of input or generated c… ☆40 · Updated 9 months ago
- UQ: Assessing Language Models on Unsolved Questions ☆30 · Updated 5 months ago
- This repository contains popular code generation frameworks such as MapCoder, CodeSIM. ☆69 · Updated 7 months ago
- AI-Driven Research Systems (ADRS) ☆117 · Updated last month
- ☆42 · Updated last year
- LIMI: Less is More for Agency ☆160 · Updated 3 months ago
- Open-source release accompanying Gao et al. 2025 ☆501 · Updated last month
- Multi-Granularity LLM Debugger [ICSE2026] ☆95 · Updated 7 months ago
- ☆258 · Updated last month
- The Automated LLM Speedrunning Benchmark measures how well LLM agents can reproduce previous innovations and discover new ones in languag… ☆128 · Updated 3 months ago
- ☆76 · Updated 3 weeks ago
- Formal-LLM: Integrating Formal Language and Natural Language for Controllable LLM-based Agents ☆132 · Updated last year
- [ICLR 2026] Official PyTorch Implementation of RLP: Reinforcement as a Pretraining Objective ☆231 · Updated last week
- Universal Reasoning Model ☆122 · Updated 3 weeks ago
- ☆225 · Updated 10 months ago
- ☆100 · Updated this week
- ☆21 · Updated 6 months ago
- A benchmark for evaluating LLMs on open-ended CS problems. Exploring the Next Frontier of Computer Science. ☆103 · Updated this week
- The official implementation of Cross-Task Experience Sharing (COPS) ☆29 · Updated last year
- [EMNLP 2025] The official implementation for paper "Agentic-R1: Distilled Dual-Strategy Reasoning" ☆102 · Updated 5 months ago
- Tree Attention: Topology-aware Decoding for Long-Context Attention on GPU clusters ☆131 · Updated last year