SidU / MathBlackBox
☆11 Updated 11 months ago
Alternatives and similar repositories for MathBlackBox
Users who are interested in MathBlackBox are comparing it to the repositories listed below.
- The official implementation of Regularized Policy Gradient (RPG) (https://arxiv.org/abs/2505.17508) ☆35 Updated last week
- ☆10 Updated last month
- ☆20 Updated last week
- ☆24 Updated 9 months ago
- ☆65 Updated 2 months ago
- A Qwen 0.5B reasoning model trained on OpenR1-Math-220k ☆14 Updated 3 months ago
- ☆16 Updated 3 months ago
- The Benefits of a Concise Chain of Thought on Problem Solving in Large Language Models ☆22 Updated 7 months ago
- Testing paligemma2 finetuning on reasoning dataset ☆18 Updated 5 months ago
- Using multiple LLMs for ensemble Forecasting ☆16 Updated last year
- [ACL 2024] Do Large Language Models Latently Perform Multi-Hop Reasoning? ☆68 Updated 3 months ago
- AgentRewardBench: Evaluating Automatic Evaluations of Web Agent Trajectories ☆18 Updated last month
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment ☆57 Updated 9 months ago
- The original Shared Recurrent Memory Transformer implementation ☆27 Updated 2 weeks ago
- Challenges for general-purpose web-browsing AI agents ☆58 Updated 3 weeks ago
- Agent Skill Induction: "Inducing Programmatic Skills for Agentic Tasks" ☆22 Updated 2 months ago
- ☆41 Updated 6 months ago
- Code Implementation, Evaluations, Documentation, Links and Resources for Min P paper ☆38 Updated 3 months ago
- Simple repository for training small reasoning models ☆33 Updated 4 months ago
- Very minimal (and stateless) agent framework ☆44 Updated 5 months ago
- ☆48 Updated last week
- ☆19 Updated 3 months ago
- [ICML 2025] Flow of Reasoning: Training LLMs for Divergent Reasoning with Minimal Examples ☆95 Updated 2 weeks ago
- SiriuS: Self-improving Multi-agent Systems via Bootstrapped Reasoning ☆57 Updated 2 months ago
- Q-Probe: A Lightweight Approach to Reward Maximization for Language Models ☆41 Updated last year
- Official Repo for InSTA: Towards Internet-Scale Training For Agents ☆42 Updated this week
- Verifiers for LLM Reinforcement Learning ☆60 Updated 2 months ago
- ☆51 Updated 7 months ago
- ☆32 Updated 5 months ago
- ☆115 Updated 4 months ago