spcl / x1
Official Implementation of "Reasoning Language Models: A Blueprint"
☆57 Updated 2 months ago
Alternatives and similar repositories for x1:
Users who are interested in x1 are comparing it to the libraries listed below.
- Code implementation of synthetic continued pretraining ☆104 Updated 3 months ago
- ☆57 Updated last month
- Code for "Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate" ☆139 Updated this week
- Code for Paper: Teaching Language Models to Critique via Reinforcement Learning ☆94 Updated last week
- ☆46 Updated last month
- Agentic Reward Modeling: Integrating Human Preferences with Verifiable Correctness Signals for Reliable Reward Systems ☆86 Updated last month
- Scalable Meta-Evaluation of LLMs as Evaluators ☆42 Updated last year
- Repo for "Z1: Efficient Test-time Scaling with Code" ☆55 Updated 2 weeks ago
- ☆149 Updated 4 months ago
- Advancing Language Model Reasoning through Reinforcement Learning and Inference Scaling ☆101 Updated 3 months ago
- Codebase accompanying the Summary of a Haystack paper. ☆77 Updated 7 months ago
- ☆101 Updated 4 months ago
- ☆107 Updated 3 months ago
- Reformatted Alignment ☆115 Updated 7 months ago
- This is the implementation of LeCo ☆32 Updated 3 months ago
- ☆91 Updated 2 months ago
- ☆33 Updated last month
- [arXiv preprint] Official Repository for "Evaluating Language Models as Synthetic Data Generators" ☆33 Updated 4 months ago
- Exploration of automated dataset selection approaches at large scales. ☆39 Updated last month
- ☆55 Updated 2 weeks ago
- ☆125 Updated 3 weeks ago
- Official implementation of paper "Autonomous Data Selection with Language Models for Mathematical Texts" (As Huggingface Daily Papers: ht… ☆80 Updated 5 months ago
- We aim to provide the best references to search, select, and synthesize high-quality and large-quantity data for post-training your LLMs. ☆54 Updated 6 months ago
- ☆32 Updated 2 weeks ago
- [ICLR'24 spotlight] Tool-Augmented Reward Modeling ☆47 Updated 3 months ago
- Code for paper "Optima: Optimizing Effectiveness and Efficiency for LLM-Based Multi-Agent System" ☆56 Updated 5 months ago
- Code and data used in the paper: "Training on Incorrect Synthetic Data via RL Scales LLM Math Reasoning Eight-Fold" ☆30 Updated 10 months ago
- Flow of Reasoning: Training LLMs for Divergent Problem Solving with Minimal Examples ☆84 Updated last month
- Code for EMNLP 2024 paper "Learn Beyond The Answer: Training Language Models with Reflection for Mathematical Reasoning" ☆53 Updated 6 months ago
- o1 Chain of Thought Examples ☆33 Updated 6 months ago