TergelMunkhbat / concise-reasoning
Code for the paper "Self-Training Elicits Concise Reasoning in Large Language Models"
☆18Updated 2 weeks ago
Alternatives and similar repositories for concise-reasoning:
Users who are interested in concise-reasoning are comparing it to the repositories listed below.
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment☆55Updated 6 months ago
- Official implementation of ECCV24 paper: POA☆24Updated 7 months ago
- We study toy models of skill learning.☆24Updated 2 months ago
- Minimal implementation of the paper "Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models" (arXiv 2401.01335)☆29Updated last year
- ☆35Updated 3 weeks ago
- Repository for the Q-Filters method (https://arxiv.org/pdf/2503.02812)☆26Updated 2 weeks ago
- A repository for research on medium sized language models.☆76Updated 10 months ago
- ☆13Updated 3 months ago
- MEXMA: Token-level objectives improve sentence representations☆41Updated 2 months ago
- Code for RATIONALYST: Pre-training Process-Supervision for Improving Reasoning https://arxiv.org/pdf/2410.01044☆32Updated 5 months ago
- A public implementation of the ReLoRA pretraining method, built on Lightning-AI's PyTorch Lightning suite.☆33Updated last year
- ReBase: Training Task Experts through Retrieval Based Distillation☆28Updated last month
- ☆15Updated 8 months ago
- ☆24Updated 6 months ago
- Code for the paper "Harnessing Webpage UIs for Text-Rich Visual Understanding"☆49Updated 3 months ago
- ☆48Updated 4 months ago
- Code, results and other artifacts from the paper introducing the WildChat-50m dataset and the Re-Wild model family.☆28Updated last month
- ☆16Updated 2 months ago
- ☆60Updated last month
- Exploration of automated dataset selection approaches at large scales.☆33Updated 3 weeks ago
- Train, tune, and run inference with the Bamba model☆86Updated 2 months ago
- The official code repo and data hub of top_nsigma sampling strategy for LLMs.☆23Updated last month
- Code for EMNLP 2024 paper "Learn Beyond The Answer: Training Language Models with Reflection for Mathematical Reasoning"☆53Updated 5 months ago
- ☆47Updated 6 months ago
- ☆32Updated 9 months ago
- The code implementation of MAGDi: Structured Distillation of Multi-Agent Interaction Graphs Improves Reasoning in Smaller Language Models…☆33Updated last year
- ☆36Updated 6 months ago
- Official Code Release for "Training a Generally Curious Agent"☆19Updated 2 weeks ago
- My fork of Allen AI's OLMo for educational purposes.☆30Updated 3 months ago
- ☆32Updated 3 weeks ago