SidU / MathBlackBox
☆11 Updated 11 months ago
Alternatives and similar repositories for MathBlackBox
Users who are interested in MathBlackBox are comparing it to the libraries listed below.
- A fast, local, and secure approach for training LLMs for coding tasks using GRPO with WebAssembly and interpreter feedback. ☆32 Updated 3 months ago
- ☆50 Updated 2 weeks ago
- Testing paligemma2 finetuning on reasoning dataset ☆18 Updated 6 months ago
- Learning to Retrieve by Trying - Source code for Grounding by Trying: LLMs with Reinforcement Learning-Enhanced Retrieval ☆39 Updated 8 months ago
- ☆10 Updated 2 months ago
- Official homepage for "Self-Harmonized Chain of Thought" (NAACL 2025) ☆91 Updated 5 months ago
- Code and data for the paper "Why think step by step? Reasoning emerges from the locality of experience" ☆60 Updated 3 months ago
- [ACL 2024] Do Large Language Models Latently Perform Multi-Hop Reasoning? ☆71 Updated 3 months ago
- ☆64 Updated last month
- Using multiple LLMs for ensemble Forecasting ☆16 Updated last year
- QAlign is a new test-time alignment approach that improves language model performance by using Markov chain Monte Carlo methods. ☆23 Updated 3 months ago
- The original Shared Recurrent Memory Transformer implementation ☆27 Updated last month
- ☆23 Updated 3 weeks ago
- The code implementation of MAGDi: Structured Distillation of Multi-Agent Interaction Graphs Improves Reasoning in Smaller Language Models… ☆35 Updated last year
- ☆66 Updated 3 months ago
- ☆19 Updated 4 months ago
- Verifiers for LLM Reinforcement Learning ☆64 Updated 3 months ago
- ☆52 Updated 8 months ago
- Very minimal (and stateless) agent framework ☆44 Updated 6 months ago
- Code for the paper: CodeTree: Agent-guided Tree Search for Code Generation with Large Language Models ☆24 Updated 3 months ago
- Training an LLM to use a calculator with multi-turn reinforcement learning, achieving a **62% absolute increase in evaluation accuracy**. ☆42 Updated 2 months ago
- ☆40 Updated 7 months ago
- AgentRewardBench: Evaluating Automatic Evaluations of Web Agent Trajectories ☆20 Updated 2 months ago
- Intelligent Go-Explore: Standing on the Shoulders of Giant Foundation Models ☆60 Updated 4 months ago
- [ACL 2025] Agentic Reward Modeling: Integrating Human Preferences with Verifiable Correctness Signals for Reliable Reward Systems ☆95 Updated last month
- entropix style sampling + GUI ☆26 Updated 8 months ago
- Source code and utilities for the Genesys distributed language model architecture discovery system. ☆40 Updated 2 weeks ago
- ☆23 Updated 2 months ago
- Implementation of the paper: "AssistantBench: Can Web Agents Solve Realistic and Time-Consuming Tasks?" ☆58 Updated 7 months ago
- Simple GRPO scripts and configurations. ☆59 Updated 5 months ago