TergelMunkhbat / concise-reasoning
Code for the paper "Self-Training Elicits Concise Reasoning in Large Language Models"
☆29 Updated 2 weeks ago
Alternatives and similar repositories for concise-reasoning:
Users who are interested in concise-reasoning are comparing it to the repositories listed below
- ☆25 Updated 7 months ago
- ☆27 Updated 3 weeks ago
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment ☆55 Updated 8 months ago
- Agentic Reward Modeling: Integrating Human Preferences with Verifiable Correctness Signals for Reliable Reward Systems ☆90 Updated 2 months ago
- ☆63 Updated last month
- Minimal implementation of the paper "Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models" (arXiv:2401.01335) ☆29 Updated last year
- ☆26 Updated last month
- Resources for our paper: "EvoAgent: Towards Automatic Multi-Agent Generation via Evolutionary Algorithms" ☆96 Updated 6 months ago
- ☆50 Updated 5 months ago
- Dynamic Cheatsheet: Test-Time Learning with Adaptive Memory ☆56 Updated 3 weeks ago
- SiriuS: Self-improving Multi-agent Systems via Bootstrapped Reasoning ☆53 Updated last month
- Official repository for the paper "ReasonIR: Training Retrievers for Reasoning Tasks" ☆112 Updated last week
- [NeurIPS 2024] Train LLMs with diverse system messages reflecting individualized preferences to generalize to unseen system messages ☆45 Updated 5 months ago
- Repository for the Q-Filters method (https://arxiv.org/pdf/2503.02812) ☆30 Updated 2 months ago
- The code implementation of MAGDi: Structured Distillation of Multi-Agent Interaction Graphs Improves Reasoning in Smaller Language Models… ☆34 Updated last year
- General Reasoner: Advancing LLM Reasoning Across All Domains ☆77 Updated this week
- ☆27 Updated last month
- ☆114 Updated 2 months ago
- ☆45 Updated last month
- ☆22 Updated 4 months ago
- ☆109 Updated 3 months ago
- Code for the paper "Harnessing Webpage UIs for Text-Rich Visual Understanding" ☆51 Updated 4 months ago
- [ICML 2025] Flow of Reasoning: Training LLMs for Divergent Problem Solving with Minimal Examples ☆85 Updated last month
- ☆92 Updated 3 months ago
- ☆48 Updated 6 months ago
- Implementation of "SelfCite: Self-Supervised Alignment for Context Attribution in Large Language Models" ☆27 Updated 2 months ago
- Systematic evaluation framework that automatically rates overthinking behavior in large language models. ☆88 Updated last month
- Official repository of "Are Your LLMs Capable of Stable Reasoning?" ☆25 Updated last month
- ☆47 Updated 4 months ago
- Repo for "Z1: Efficient Test-time Scaling with Code" ☆57 Updated 3 weeks ago