weirayao / RetroformerLinks

☆35

Alternatives and similar repositories for Retroformer

Users that are interested in Retroformer are comparing it to the libraries listed below

Sorting:

WeiminXiong / IPR
Watch Every Step! LLM Agent Learning via Iterative Step-level Process Refinement (EMNLP 2024 Main Conference)
☆62Updated last year
sail-sg / CPO
[NeurIPS 2024] The official implementation of paper: Chain of Preference Optimization: Improving Chain-of-Thought Reasoning in LLMs.
☆130Updated 6 months ago
Yifan-Song793 / ETO
Trial and Error: Exploration-Based Trajectory Optimization of LLM Agents (ACL 2024 Main Conference)
☆151Updated 11 months ago
zhiyuanhubj / UoT
[NeurIPS 2024] Uncertainty of Thoughts: Uncertainty-Aware Planning Enhances Information Seeking in Large Language Models
☆102Updated last year
PKU-Alignment / aligner
[NeurIPS 2024 Oral] Aligner: Efficient Alignment by Learning to Correct
☆186Updated 9 months ago
Edward-Sun / easy-to-hard
Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision
☆123Updated last year
ZHZisZZ / modpo
[ACL'24] Beyond One-Preference-Fits-All Alignment: Multi-Objective Direct Preference Optimization
☆89Updated last year
Vance0124 / Token-level-Direct-Preference-Optimization
Reference implementation for Token-level Direct Preference Optimization(TDPO)
☆148Updated 8 months ago
TianHongZXY / RLVR-Decomposed
[NeurIPS 2025] Implementation for the paper "The Surprising Effectiveness of Negative Reinforcement in LLM Reasoning"
☆112Updated last month
YuxiXie / SelfEval-Guided-Decoding
☆103Updated last year
JoeYing1019 / UltraTool
[ACL2024] Planning, Creation, Usage: Benchmarking LLMs for Comprehensive Tool Utilization in Real-World Complex Scenarios
☆65Updated 2 months ago
WooooDyy / LLM-Reverse-Curriculum-RL
Implementation of the ICML 2024 paper "Training Large Language Models for Reasoning through Reverse Curriculum Reinforcement Learning" pr…
☆111Updated last year
PRIME-RL / ImplicitPRM
Repo of paper "Free Process Rewards without Process Labels"
☆164Updated 7 months ago
icip-cas / Verifier-Engineering
Search, Verify and Feedback: Towards Next Generation Post-training Paradigm of Foundation Models via Verifier Engineering
☆61Updated 10 months ago
Reason-Wang / NAT
[NAACL 2025] The official implementation of paper "Learning From Failure: Integrating Negative Examples when Fine-tuning Large Language M…
☆29Updated last year
kanishkg / cognitive-behaviors
☆208Updated 6 months ago
mandyyyyii / scibench
☆128Updated last year
joeljang / RLPHF
Personalized Soups: Personalized Large Language Model Alignment via Post-hoc Parameter Merging
☆108Updated last year
SALT-NLP / DyLAN
Official Implementation of Dynamic LLM-Agent Network: An LLM-agent Collaboration Framework with Agent Team Optimization
☆170Updated last year
Ber666 / ToolkenGPT
ToolkenGPT: Augmenting Frozen Language Models with Massive Tools via Tool Embeddings - NeurIPS 2023 (oral)
☆262Updated last year
ZitongYang / Synthetic_Continued_Pretraining
Code implementation of synthetic continued pretraining
☆135Updated 9 months ago
BunsenFeng / AbstainQA
AbstainQA, ACL 2024
☆28Updated last year
zhangxjohn / LLM-Agent-Benchmark-List
A banchmark list for evaluation of large language models.
☆144Updated last month
SparkJiao / dpo-trajectory-reasoning
[EMNLP 2024] Source code for the paper "Learning Planning-based Reasoning with Trajectory Collection and Process Rewards Synthesizing".
☆82Updated 9 months ago
genrm-star / genrm-critiques
GenRM-CoT: Data release for verification rationales
☆66Updated last year
lfy79001 / Awesome-Table-QA
A comprehensive paper list of Table-based Question Answering.
☆36Updated 2 years ago
rookie-joe / AutoPSV
☆50Updated 11 months ago
WindyLee0822 / Process_Q_Model
official implementation of paper "Process Reward Model with Q-value Rankings"
☆64Updated 8 months ago
xufangzhi / ENVISIONS
[ACL 2025] A Neural-Symbolic Self-Training Framework
☆115Updated 4 months ago
QwenLM / ProcessBench
Official repository for ACL 2025 paper "ProcessBench: Identifying Process Errors in Mathematical Reasoning"
☆174Updated 4 months ago