QwenLM / Self-Lengthen
☆82 · Updated 4 months ago
Alternatives and similar repositories for Self-Lengthen:
Users who are interested in Self-Lengthen are comparing it to the repositories listed below.
- Reformatted Alignment ☆115 · Updated 6 months ago
- Code for Paper: Teaching Language Models to Critique via Reinforcement Learning ☆84 · Updated last month
- [ICLR 2025] SuperCorrect: Advancing Small LLM Reasoning with Thought Template Distillation and Self-Correction ☆64 · Updated last week
- Official implementation of the paper "From Complex to Simple: Enhancing Multi-Constraint Complex Instruction Following Ability of Large L… ☆46 · Updated 9 months ago
- We introduce ScaleQuest, a scalable, novel and cost-effective data synthesis method to unleash the reasoning capability of LLMs. ☆60 · Updated 5 months ago
- [ICLR'25] Data and code for our paper "Why Does the Effective Context Length of LLMs Fall Short?" ☆70 · Updated 4 months ago
- [Preprint] An inference-time decoding strategy with adaptive foresight sampling ☆86 · Updated last week
- Code for "Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate" ☆131 · Updated last month
- Advancing Language Model Reasoning through Reinforcement Learning and Inference Scaling ☆95 · Updated 2 months ago
- ☆52 · Updated 3 weeks ago
- Code implementation of synthetic continued pretraining ☆97 · Updated 2 months ago
- The official repository of the Omni-MATH benchmark. ☆78 · Updated 3 months ago
- [NeurIPS 2024] OlympicArena: Benchmarking Multi-discipline Cognitive Reasoning for Superintelligent AI ☆97 · Updated 3 weeks ago
- [NeurIPS 2024] The official implementation of paper: Chain of Preference Optimization: Improving Chain-of-Thought Reasoning in LLMs. ☆107 · Updated last week
- Code for "Your Mixture-of-Experts LLM Is Secretly an Embedding Model For Free" ☆58 · Updated 5 months ago
- LongHeads: Multi-Head Attention is Secretly a Long Context Processor ☆29 · Updated 11 months ago
- ☆101 · Updated 3 months ago
- ☆35 · Updated 2 months ago
- Official repository for paper "Weak-to-Strong Extrapolation Expedites Alignment" ☆73 · Updated 9 months ago
- [ICLR 2025] InstructRAG: Instructing Retrieval-Augmented Generation via Self-Synthesized Rationales ☆80 · Updated last month
- L1: Controlling How Long A Reasoning Model Thinks With Reinforcement Learning ☆162 · Updated 2 weeks ago
- ☆76 · Updated 2 months ago
- Codebase for Instruction Following without Instruction Tuning ☆33 · Updated 6 months ago
- Official implementation for 'Extending LLMs’ Context Window with 100 Samples' ☆75 · Updated last year
- ☆34 · Updated 3 months ago
- The official code repo and data hub of top_nsigma sampling strategy for LLMs. ☆24 · Updated last month
- [ICLR 2025] Benchmarking Agentic Workflow Generation ☆67 · Updated last month
- B-STAR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners ☆75 · Updated 2 months ago
- The source code and dataset mentioned in the paper Seal-Tools: Self-Instruct Tool Learning Dataset for Agent Tuning and Detailed Benchmar… ☆46 · Updated 4 months ago
- ☆49 · Updated last year