1KE-JI / UPFT
Official resources of "The First Few Tokens Are All You Need: An Efficient and Effective Unsupervised Prefix Fine-Tuning Method for Reasoning Models", accepted by NeurIPS 2025
☆15 · Updated 6 months ago
Alternatives and similar repositories for UPFT
Users who are interested in UPFT are comparing it to the repositories listed below.
- This is the repository of DEER, a Dynamic Early Exit in Reasoning method for Large Reasoning Language Models. ☆176 · Updated 6 months ago
- Awesome-Long2short-on-LRMs is a collection of state-of-the-art, novel, exciting long2short methods on large reasoning models. It contains… ☆255 · Updated 4 months ago
- A versatile toolkit for applying Logit Lens to modern large language models (LLMs). Currently supports Llama-3.1-8B and Qwen-2.5-7B, enab… ☆142 · Updated 4 months ago
- ☆37 · Updated last year
- xVerify: Efficient Answer Verifier for Reasoning Model Evaluations ☆143 · Updated last month
- 📜 Paper list on decoding methods for LLMs and LVLMs ☆66 · Updated 2 months ago
- ☆52 · Updated 3 months ago
- 🔧Tool-Star: Empowering LLM-brained Multi-Tool Reasoner via Reinforcement Learning ☆300 · Updated 2 months ago
- [ICLR 2025] Code and Data Repo for Paper "Latent Space Chain-of-Embedding Enables Output-free LLM Self-Evaluation" ☆93 · Updated last year
- Code, Data and Model for Paper "Learning from Peers in Reasoning Models" ☆27 · Updated 7 months ago
- ☆179 · Updated last year
- Extrapolating RLVR to General Domains without Verifiers ☆187 · Updated 4 months ago
- [EMNLP 2025] TokenSkip: Controllable Chain-of-Thought Compression in LLMs ☆197 · Updated last month
- Official repository for paper: O1-Pruner: Length-Harmonizing Fine-Tuning for O1-Like Reasoning Pruning ☆96 · Updated 10 months ago
- Chain of Thoughts (CoT) is so hot! so long! We need short reasoning process! ☆71 · Updated 9 months ago
- LLM hallucination paper list ☆328 · Updated last year
- An awesome repository & A comprehensive survey on interpretability of LLM attention heads. ☆392 · Updated 10 months ago
- CoT-Valve: Length-Compressible Chain-of-Thought Tuning ☆88 · Updated 10 months ago
- A novel approach to improve the safety of large language models, enabling them to transition effectively from unsafe to safe state. ☆73 · Updated 7 months ago
- This repository contains the code for SFT, RLHF, and DPO, designed for vision-based LLMs, including the LLaVA models and the LLaMA-3.2-vi… ☆118 · Updated 6 months ago
- Code and Data for Paper "AutoTIR: Autonomous Tools Integrated Reasoning via Reinforcement Learning" ☆47 · Updated 4 months ago
- This is the official GitHub repository for our survey paper "Beyond Single-Turn: A Survey on Multi-Turn Interactions with Large Language … ☆161 · Updated 7 months ago
- Test-time preference optimization (ICML 2025). ☆177 · Updated 8 months ago
- [ACL'25] The official code repository for PRMBench: A Fine-grained and Challenging Benchmark for Process-Level Reward Models. ☆86 · Updated 10 months ago
- Official repo of Toucan: Synthesizing 1.5M Tool-Agentic Data from Real-World MCP Environments ☆194 · Updated 3 weeks ago
- This repository contains a regularly updated paper list for LLMs-reasoning-in-latent-space. ☆253 · Updated this week
- Small Models, Big Insights: Leveraging Slim Proxy Models To Decide When and What to Retrieve for LLMs (ACL 2024) ☆73 · Updated 8 months ago
- The demo, code and data of FollowRAG ☆75 · Updated 6 months ago
- A curated list of LLM Interpretability related material - Tutorial, Library, Survey, Paper, Blog, etc.. ☆290 · Updated 2 weeks ago
- Implementation code for ACL 2024: Advancing Parameter Efficiency in Fine-tuning via Representation Editing ☆14 · Updated last year