sastpg / RFTT
RFTT: Reasoning with Reinforced Functional Token Tuning
☆22 Updated last week
Alternatives and similar repositories for RFTT:
Users who are interested in RFTT are comparing it to the repositories listed below.
- [ICLR 2025] SuperCorrect: Advancing Small LLM Reasoning with Thought Template Distillation and Self-Correction ☆64 Updated last week
- Implementation for the research paper "Enhancing LLM Reasoning via Critique Models with Test-Time and Training-Time Supervision". ☆52 Updated 4 months ago
- Research Code for preprint "Optimizing Test-Time Compute via Meta Reinforcement Finetuning". ☆78 Updated 2 weeks ago
- L1: Controlling How Long A Reasoning Model Thinks With Reinforcement Learning ☆162 Updated last week
- Official repository for paper: O1-Pruner: Length-Harmonizing Fine-Tuning for O1-Like Reasoning Pruning ☆64 Updated last month
- Paper List of Inference/Test Time Scaling/Computing ☆131 Updated this week
- [NeurIPS 2024] The official implementation of paper: Chain of Preference Optimization: Improving Chain-of-Thought Reasoning in LLMs. ☆104 Updated last week
- [NeurIPS 2024 Oral] Aligner: Efficient Alignment by Learning to Correct ☆167 Updated 2 months ago
- ☆49 Updated last month
- CoT-Valve: Length-Compressible Chain-of-Thought Tuning ☆56 Updated last month
- This is the official implementation of the paper "S²R: Teaching LLMs to Self-verify and Self-correct via Reinforcement Learning" ☆50 Updated 2 weeks ago
- ☆28 Updated 6 months ago
- Chain of Thoughts (CoT) is so hot! so long! We need short reasoning process! ☆43 Updated 2 weeks ago
- [NeurIPS 2024] Official Implementation for Optimus-1: Hybrid Multimodal Memory Empowered Agents Excel in Long-Horizon Tasks ☆70 Updated 2 weeks ago
- TokenSkip: Controllable Chain-of-Thought Compression in LLMs ☆103 Updated 2 weeks ago
- ☆41 Updated 5 months ago
- ☆43 Updated 5 months ago
- A Self-Training Framework for Vision-Language Reasoning ☆73 Updated 2 months ago
- SOTA RL fine-tuning solution for advanced math reasoning of LLM ☆92 Updated this week
- (ICLR'25) A Comprehensive Framework for Developing and Evaluating Multimodal Role-Playing Agents ☆62 Updated last month
- ☆129 Updated this week
- This repository contains a regularly updated paper list for LLMs-reasoning-in-latent-space. ☆67 Updated this week
- ☆59 Updated this week
- Code and data for OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis ☆118 Updated this week
- ☆59 Updated 3 months ago
- Official code for "BoostStep: Boosting mathematical capability of Large Language Models via improved single-step reasoning" ☆33 Updated 2 months ago
- Repo of paper "Free Process Rewards without Process Labels" ☆138 Updated 2 weeks ago
- ☆138 Updated 2 weeks ago
- ☆48 Updated last month
- The official repository for the paper "Can MLLMs Reason in Multimodality? EMMA: An Enhanced MultiModal ReAsoning Benchmark" ☆47 Updated this week