MingLiiii / Layer_Gradient
What Happened in LLMs Layers when Trained for Fast vs. Slow Thinking: A Gradient Perspective
☆63Updated 2 months ago
Alternatives and similar repositories for Layer_Gradient:
Users that are interested in Layer_Gradient are comparing it to the libraries listed below
- ☆95Updated last month
- [COLING'25] Exploring Concept Depth: How Large Language Models Acquire Knowledge at Different Layers?☆76Updated 3 months ago
- [NeurIPS 2024] The official implementation of paper: Chain of Preference Optimization: Improving Chain-of-Thought Reasoning in LLMs.☆118Updated last month
- Critique-out-Loud Reward Models☆63Updated 6 months ago
- Code for Paper: Teaching Language Models to Critique via Reinforcement Learning☆94Updated 3 weeks ago
- Code for "A Sober Look at Progress in Language Model Reasoning" paper☆41Updated 2 weeks ago
- Homepage for ProLong (Princeton long-context language models) and paper "How to Train Long-Context Language Models (Effectively)"☆177Updated last month
- Official repository for paper: O1-Pruner: Length-Harmonizing Fine-Tuning for O1-Like Reasoning Pruning☆70Updated 2 months ago
- Advancing Language Model Reasoning through Reinforcement Learning and Inference Scaling☆101Updated 3 months ago
- [ICLR 2025] SuperCorrect: Advancing Small LLM Reasoning with Thought Template Distillation and Self-Correction☆69Updated last month
- Large Language Models Can Self-Improve in Long-context Reasoning☆69Updated 5 months ago
- Code for "Reasoning to Learn from Latent Thoughts"☆93Updated last month
- Code and Data for "Long-context LLMs Struggle with Long In-context Learning" [TMLR2025]☆105Updated 2 months ago
- ☆97Updated 10 months ago
- ☆72Updated 5 months ago
- Code for "Your Mixture-of-Experts LLM Is Secretly an Embedding Model For Free"☆67Updated 6 months ago
- [ICLR 2025] LongPO: Long Context Self-Evolution of Large Language Models through Short-to-Long Preference Optimization☆35Updated 2 months ago
- Repo for "Z1: Efficient Test-time Scaling with Code"☆57Updated 3 weeks ago
- Code associated with Tuning Language Models by Proxy (Liu et al., 2024)☆109Updated last year
- [AAAI 2025 oral] Evaluating Mathematical Reasoning Beyond Accuracy☆60Updated 4 months ago
- Repo of paper "Free Process Rewards without Process Labels"☆145Updated last month
- The official implementation of "Ada-LEval: Evaluating long-context LLMs with length-adaptable benchmarks"☆53Updated last year
- HelloBench: Evaluating Long Text Generation Capabilities of Large Language Models☆42Updated 5 months ago
- ☆24Updated 2 weeks ago
- Interpretable Contrastive Monte Carlo Tree Search Reasoning☆48Updated 5 months ago
- Official repository for Montessori-Instruct: Generate Influential Training Data Tailored for Student Learning [ICLR 2025]☆44Updated 3 months ago
- The official implementation of Self-Exploring Language Models (SELM)☆63Updated 11 months ago
- Unofficial Implementation of Chain-of-Thought Reasoning Without Prompting☆32Updated last year
- ☆59Updated 8 months ago
- ☆76Updated last week