☆52Feb 12, 2025Updated last year
Alternatives and similar repositories for Self-Backtracking
Users that are interested in Self-Backtracking are comparing it to the libraries listed below
Sorting:
- MathFusion: Enhancing Mathematical Problem-solving of LLM through Instruction Fusion (ACL 2025)☆35Jul 16, 2025Updated 7 months ago
- ☆46Jun 11, 2025Updated 8 months ago
- ☆46Sep 27, 2025Updated 5 months ago
- ☆145Sep 12, 2025Updated 5 months ago
- Library to facilitate pruning of LLMs based on context☆32Jan 31, 2024Updated 2 years ago
- This is the official implementation of the paper "S²R: Teaching LLMs to Self-verify and Self-correct via Reinforcement Learning"☆74Apr 22, 2025Updated 10 months ago
- Official repository of paper "Context-DPO: Aligning Language Models for Context-Faithfulness"☆21Feb 17, 2025Updated last year
- ☆32Oct 13, 2025Updated 4 months ago
- official code for "BoostStep: Boosting mathematical capability of Large Language Models via improved single-step reasoning"☆37Jan 21, 2025Updated last year
- Official repository for paper: O1-Pruner: Length-Harmonizing Fine-Tuning for O1-Like Reasoning Pruning☆97Feb 21, 2025Updated last year
- Official Implementation of FastKV: Decoupling of Context Reduction and KV Cache Compression for Prefill-Decoding Acceleration☆29Nov 22, 2025Updated 3 months ago
- Suri: Multi-constraint instruction following for long-form text generation (EMNLP’24)☆27Oct 3, 2025Updated 4 months ago
- ☆14Apr 14, 2025Updated 10 months ago
- Entropy-Driven GRPO with Guided Error Correction for Advantage Diversity☆22Aug 28, 2025Updated 6 months ago
- ☆33Nov 18, 2025Updated 3 months ago
- ☆14Jan 24, 2025Updated last year
- TARS: MinMax Token-Adaptive Preference Strategy for Hallucination Reduction in MLLMs☆23Sep 21, 2025Updated 5 months ago
- ☆13May 21, 2023Updated 2 years ago
- ☆25Apr 10, 2025Updated 10 months ago
- Code for "CREAM: Consistency Regularized Self-Rewarding Language Models", ICLR 2025.☆28Feb 17, 2025Updated last year
- Align your LM to express calibrated verbal statements of confidence in its long-form generations.☆29Jun 4, 2024Updated last year
- ☆60Jan 12, 2026Updated last month
- Large Language Models Can Self-Improve in Long-context Reasoning☆72Nov 24, 2024Updated last year
- [ACL 2025] Agentic Reward Modeling: Integrating Human Preferences with Verifiable Correctness Signals for Reliable Reward Systems☆125Jun 11, 2025Updated 8 months ago
- A Sober Look at Language Model Reasoning☆93Nov 18, 2025Updated 3 months ago
- ☆56Mar 6, 2025Updated 11 months ago
- ROUTE: Robust Multitask Tuning and Collaboration for Text-to-SQL (ICLR 2025 Pytorch Code)☆17May 15, 2025Updated 9 months ago
- Official repository for ALT (ALignment with Textual feedback).☆10Jul 25, 2024Updated last year
- Code and data release of the paper Enhancing LLM Complex Problem-Solving with Hybrid Thinking and Dynamic Workflows☆14Oct 4, 2024Updated last year
- ☆13Jan 22, 2025Updated last year
- Codes for our paper "AgentMonitor: A Plug-and-Play Framework for Predictive and Secure Multi-Agent Systems"☆13Dec 13, 2024Updated last year
- Fast and Slow Generating: An Empirical Study on Large and Small Language Models Collaborative Decoding.☆13Nov 19, 2024Updated last year
- ☆21Oct 22, 2025Updated 4 months ago
- [ACL 2024 Findings] Light-PEFT: Lightening Parameter-Efficient Fine-Tuning via Early Pruning☆13Sep 2, 2024Updated last year
- [ICML'25] "Rethinking Addressing in Language Models via Contextualized Equivariant Positional Encoding" by Jiajun Zhu, Peihao Wang, Ruisi…☆14Jun 6, 2025Updated 8 months ago
- ☆34Jan 25, 2026Updated last month
- The original Shared Recurrent Memory Transformer implementation☆33Jul 11, 2025Updated 7 months ago
- Source code of paper ''KVSharer: Efficient Inference via Layer-Wise Dissimilar KV Cache Sharing''☆31Oct 24, 2024Updated last year
- RAG-RewardBench: Benchmarking Reward Models in Retrieval Augmented Generation for Preference Alignment☆16Dec 19, 2024Updated last year