SidU / MathBlackBox
☆11 · Updated 10 months ago
Alternatives and similar repositories for MathBlackBox
Users who are interested in MathBlackBox are comparing it to the repositories listed below.
- ☆64 · Updated 2 months ago
- ☆19 · Updated 3 weeks ago
- AgentRewardBench: Evaluating Automatic Evaluations of Web Agent Trajectories · ☆15 · Updated 2 weeks ago
- ☆24 · Updated 8 months ago
- Agent Skill Induction: "Inducing Programmatic Skills for Agentic Tasks" · ☆20 · Updated last month
- Using multiple LLMs for ensemble Forecasting · ☆16 · Updated last year
- [ICML 2025] Flow of Reasoning: Training LLMs for Divergent Reasoning with Minimal Examples · ☆88 · Updated last week
- ☆38 · Updated this week
- Challenges for general-purpose web-browsing AI agents · ☆58 · Updated this week
- ☆9 · Updated last month
- The code implementation of MAGDi: Structured Distillation of Multi-Agent Interaction Graphs Improves Reasoning in Smaller Language Models… · ☆34 · Updated last year
- Official Repo for InSTA: Towards Internet-Scale Training For Agents · ☆42 · Updated this week
- Script for processing OpenAI's PRM800K process supervision dataset into an Alpaca-style instruction-response format · ☆27 · Updated last year
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment · ☆57 · Updated 9 months ago
- ☆17 · Updated 3 months ago
- Code and data for the paper "Why think step by step? Reasoning emerges from the locality of experience" · ☆60 · Updated last month
- A testbed for agents and environments that can automatically improve models through data generation. · ☆24 · Updated 2 months ago
- ☆49 · Updated 6 months ago
- A Qwen .5B reasoning model trained on OpenR1-Math-220k · ☆14 · Updated 3 months ago
- The Benefits of a Concise Chain of Thought on Problem Solving in Large Language Models · ☆22 · Updated 6 months ago
- Testing paligemma2 finetuning on reasoning dataset · ☆18 · Updated 5 months ago
- Agentic Reward Modeling: Integrating Human Preferences with Verifiable Correctness Signals for Reliable Reward Systems · ☆90 · Updated 2 months ago
- Intelligent Go-Explore: Standing on the Shoulders of Giant Foundation Models · ☆58 · Updated 3 months ago
- ☆19 · Updated this week
- Q-Probe: A Lightweight Approach to Reward Maximization for Language Models · ☆41 · Updated 11 months ago
- SiriuS: Self-improving Multi-agent Systems via Bootstrapped Reasoning · ☆53 · Updated last month
- ☆40 · Updated 10 months ago
- ☆27 · Updated this week
- The official implementation of Self-Exploring Language Models (SELM) · ☆64 · Updated 11 months ago
- A fast, local, and secure approach for training LLMs for coding tasks using GRPO with WebAssembly and interpreter feedback. · ☆24 · Updated last month