TergelMunkhbat / concise-reasoning
Code for the paper "Self-Training Elicits Concise Reasoning in Large Language Models"
☆34Updated 2 months ago
Alternatives and similar repositories for concise-reasoning
Users who are interested in concise-reasoning are comparing it to the repositories listed below.
- Verifiers for LLM Reinforcement Learning☆60Updated 2 months ago
- [ACL 2025] Agentic Reward Modeling: Integrating Human Preferences with Verifiable Correctness Signals for Reliable Reward Systems☆93Updated 2 weeks ago
- AgentRewardBench: Evaluating Automatic Evaluations of Web Agent Trajectories☆18Updated last month
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment☆57Updated 9 months ago
- ☆24Updated 9 months ago
- ☆32Updated last month
- Official implementation of the paper "Process Reward Model with Q-value Rankings"☆59Updated 4 months ago
- Code for the paper "Harnessing Webpage UIs for Text-Rich Visual Understanding"☆51Updated 6 months ago
- ☆35Updated 3 weeks ago
- ☆48Updated 2 weeks ago
- A repository for research on medium sized language models.☆76Updated last year
- ☆45Updated last month
- Official Code Release for "Training a Generally Curious Agent"☆25Updated last month
- Minimal implementation of the paper "Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models" (arXiv:2401.01335)☆29Updated last year
- Ruler: A Model-Agnostic Method to Control Generated Length for Large Language Models☆36Updated 8 months ago
- ☆20Updated last week
- Systematic evaluation framework that automatically rates overthinking behavior in large language models.☆90Updated last month
- The official code repo and data hub of top_nsigma sampling strategy for LLMs.☆26Updated 4 months ago
- Process Reward Models That Think☆41Updated 3 weeks ago
- Official repo of paper LM2☆41Updated 4 months ago
- Codebase accompanying the Summary of a Haystack paper.☆78Updated 9 months ago
- Code for EMNLP 2024 paper "Learn Beyond The Answer: Training Language Models with Reflection for Mathematical Reasoning"☆54Updated 8 months ago
- SiriuS: Self-improving Multi-agent Systems via Bootstrapped Reasoning☆57Updated 2 months ago
- ☆47Updated 3 weeks ago
- Resources for our paper: "EvoAgent: Towards Automatic Multi-Agent Generation via Evolutionary Algorithms"☆108Updated 8 months ago
- FastCuRL: Curriculum Reinforcement Learning with Stage-wise Context Scaling for Efficient LLM Reasoning☆52Updated 3 weeks ago
- Source code for the collaborative reasoner research project at Meta FAIR.☆91Updated 2 months ago
- Complex Function Calling Benchmark.☆114Updated 5 months ago
- Training teacher models with reinforcement learning to teach LLMs how to reason for test-time scaling.☆122Updated this week
- ☆13Updated 6 months ago