Mr-Tieguigui / LLM-Post-Training
☆62Updated 4 months ago
Alternatives and similar repositories for LLM-Post-Training
Users who are interested in LLM-Post-Training are comparing it to the repositories listed below
- ☆100Updated 4 months ago
- ☆34Updated last month
- A curated list of awesome LLM Inference-Time Self-Improvement (ITSI, pronounced "itsy") papers from our recent survey: A Survey on Large …☆95Updated 9 months ago
- ☆125Updated 6 months ago
- Code for paper "Patch-Level Training for Large Language Models"☆87Updated 10 months ago
- ☆53Updated 7 months ago
- "what, how, where, and how well? a survey on test-time scaling in large language models" repository☆67Updated this week
- Scaling Preference Data Curation via Human-AI Synergy☆108Updated 2 months ago
- [ICLR 2025] LongPO: Long Context Self-Evolution of Large Language Models through Short-to-Long Preference Optimization☆40Updated 6 months ago
- ☆130Updated 3 weeks ago
- RM-R1: Unleashing the Reasoning Potential of Reward Models☆134Updated 3 months ago
- ☆154Updated 3 months ago
- [EMNLP 2025] LightThinker: Thinking Step-by-Step Compression☆100Updated 5 months ago
- [NeurIPS'24] Weak-to-Strong Search: Align Large Language Models via Searching over Small Language Models☆62Updated 9 months ago
- This repo aims to collect resources on role-playing abilities in LLMs, including datasets, papers, applications, etc.☆129Updated last year
- ☆67Updated 3 months ago
- Fantastic Data Engineering for Large Language Models☆91Updated 8 months ago
- ☆49Updated 6 months ago
- Enable Next-sentence Prediction for Large Language Models with Faster Speed, Higher Accuracy and Longer Context☆36Updated last year
- A comprehensive collection of learning from rewards in the post-training and test-time scaling of LLMs, with a focus on both reward model…☆56Updated 3 months ago
- Klear-Reasoner: Advancing Reasoning Capability via Gradient-Preserving Clipping Policy Optimization☆71Updated last week
- SELF-GUIDE: Better Task-Specific Instruction Following via Self-Synthetic Finetuning. COLM 2024 Accepted Paper☆33Updated last year
- Code for Blog Post: Can Better Cold-Start Strategies Improve RL Training for LLMs?☆17Updated 6 months ago
- The official repo for "AceCoder: Acing Coder RL via Automated Test-Case Synthesis" [ACL25]☆88Updated 5 months ago
- Repo for "Z1: Efficient Test-time Scaling with Code"☆64Updated 5 months ago
- [NAACL 2025] A Closer Look into Mixture-of-Experts in Large Language Models☆55Updated 7 months ago
- This is the official GitHub repository for our survey paper "Beyond Single-Turn: A Survey on Multi-Turn Interactions with Large Language …☆115Updated 4 months ago
- ☆46Updated 2 months ago
- [NeurIPS 2024] A Novel Rank-Based Metric for Evaluating Large Language Models☆53Updated 3 months ago
- Rethinking RL Scaling for Vision Language Models: A Transparent, From-Scratch Framework and Comprehensive Evaluation Scheme☆141Updated 5 months ago