SidU / MathBlackBox
☆11Updated 9 months ago
Alternatives and similar repositories for MathBlackBox:
Users who are interested in MathBlackBox are comparing it to the libraries listed below
- A fast, local, and secure approach for training LLMs for coding tasks using GRPO with WebAssembly and interpreter feedback.☆22Updated last month
- ☆48Updated 5 months ago
- Official homepage for "Self-Harmonized Chain of Thought" (NAACL 2025)☆90Updated 3 months ago
- Code and data for the paper "Why think step by step? Reasoning emerges from the locality of experience"☆60Updated last month
- Q-Probe: A Lightweight Approach to Reward Maximization for Language Models☆41Updated 10 months ago
- Using multiple LLMs for ensemble forecasting☆16Updated last year
- A repository for research on medium-sized language models.☆76Updated 11 months ago
- A Qwen 0.5B reasoning model trained on OpenR1-Math-220k☆13Updated 2 months ago
- The code implementation of MAGDi: Structured Distillation of Multi-Agent Interaction Graphs Improves Reasoning in Smaller Language Models…☆34Updated last year
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment☆55Updated 8 months ago
- ☆16Updated 2 months ago
- Testing paligemma2 fine-tuning on a reasoning dataset☆18Updated 4 months ago
- ☆24Updated 7 months ago
- [ICML 2025] Flow of Reasoning: Training LLMs for Divergent Problem Solving with Minimal Examples☆85Updated last month
- Simple repository for training small reasoning models☆27Updated 2 months ago
- ☆142Updated last year
- ☆50Updated 5 months ago
- Agentic Reward Modeling: Integrating Human Preferences with Verifiable Correctness Signals for Reliable Reward Systems☆90Updated last month
- [ACL 2024] Do Large Language Models Latently Perform Multi-Hop Reasoning?☆65Updated last month
- ☆20Updated 4 months ago
- ☆41Updated 4 months ago
- ☆18Updated 7 months ago
- ☆114Updated 2 months ago
- ☆63Updated last month
- Learning to Retrieve by Trying - Source code for Grounding by Trying: LLMs with Reinforcement Learning-Enhanced Retrieval☆33Updated 6 months ago
- Agent Skill Induction: "Inducing Programmatic Skills for Agentic Tasks"☆12Updated last week
- Toy implementation of Strawberry☆31Updated 7 months ago
- ☆27Updated last month
- Resources for our paper: "EvoAgent: Towards Automatic Multi-Agent Generation via Evolutionary Algorithms"☆96Updated 6 months ago
- Minimal implementation of the paper "Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models" (arXiv:2401.01335)☆29Updated last year