WisdomShell/RewardAnything

RewardAnything: Generalizable Principle-Following Reward Models

☆44

Alternatives and similar repositories for RewardAnything

Users who are interested in RewardAnything are comparing it to the repositories listed below.

kyegomez / SelfExtend
Implementation of SelfExtend from the paper "LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning" using PyTorch and Zeta
☆13 · Nov 11, 2024 · Updated last year
prnake / kimi-deepresearch
Kimi K2 Thinking Agentic Search Unofficial Implementation
☆15 · Nov 9, 2025 · Updated 8 months ago
FloyedShen / AntiSD
Anti-Self-Distillation for Reasoning RL via Pointwise Mutual Information
☆32 · May 14, 2026 · Updated 2 months ago
pppa2019 / swie_overmiss_llm4mt
Code for "Improving Translation Faithfulness of Large Language Models via Augmenting Instructions"
☆12 · Aug 26, 2023 · Updated 2 years ago
freedomkk-qfeng / DeepSeek-ReAct-Native-example
A Python example project showcasing the capabilities of DeepSeek-V3.2 models combining "Thinking Mode" (Reasoning) with Tool Callin…
☆19 · Jan 19, 2026 · Updated 6 months ago
RM-R1-UIUC / RM-R1
[ICLR'26] RM-R1: Unleashing the Reasoning Potential of Reward Models
☆167 · Jun 26, 2025 · Updated last year
Adaxry / Post-Instruction
☆21 · Sep 5, 2023 · Updated 2 years ago
Kangningthu / SUM
Uncertainty-aware Fine-tuning of Segmentation Foundation Models (NeurIPS 2024).
☆16 · Jan 9, 2025 · Updated last year
AxelSorensenDev / Eevee
An Easy Annotation Tool for Natural Language Processing
☆12 · May 17, 2024 · Updated 2 years ago
joykirat18 / How-To-Think-Step-by-Step
How to think step-by-step: A mechanistic understanding of chain-of-thought reasoning
☆26 · Aug 29, 2024 · Updated last year
OpenGVLab / VRBench
[ICCV 2025] A Benchmark for Multi-Step Reasoning in Long Narrative Videos
☆28 · Jun 4, 2026 · Updated last month
RedSearchAgent / DeepTraceHub
RedSearcher's framework for deep search agent trajectory synthesis, QA filtering, and model evaluation, supporting ReACT and DeepSeek-sty…
☆23 · Feb 26, 2026 · Updated 4 months ago
miaoyuchun / InfoRM
The official implementation of InfoRM [NeurIPS 2024].
☆16 · Oct 25, 2025 · Updated 8 months ago
lemon0830 / TIM
Code for "Teaching LM to Translate with Comparison"
☆39 · Dec 15, 2023 · Updated 2 years ago
liziniu / cold_start_rl
Code for Blog Post: Can Better Cold-Start Strategies Improve RL Training for LLMs?
☆20 · Mar 9, 2025 · Updated last year
violetxi / ExpRL
☆19 · Jun 16, 2026 · Updated last month
SalesforceAIResearch / PretrainRL-pipeline
An automated data pipeline scaling RL to pretraining levels
☆76 · Jun 2, 2026 · Updated last month
wangpf3 / consistent-CoT-distillation
☆45 · Aug 23, 2023 · Updated 2 years ago
UCSB-NLP-Chang / Prereq_tune
Implementation for the paper "Fictitious Synthetic Data Can Improve LLM Factuality via Prerequisite Learning"
☆11 · Jan 10, 2025 · Updated last year
kobayashikanna01 / Chain-of-Discussion
☆11 · May 28, 2024 · Updated 2 years ago
FloyedShen / VESPO
☆34 · Feb 12, 2026 · Updated 5 months ago
hanningzhang / ER-PRM
☆20 · Dec 14, 2024 · Updated last year
JinaLeejnl / AlignX
[ACL 2026] From 1,000,000 Users to Every User: Scaling Up Personalized Preference for User-level Alignment
☆38 · Jan 8, 2026 · Updated 6 months ago
shenao-zhang / reward-augmented-preference
The official implementation of Preference Data Reward-Augmentation.
☆18 · May 1, 2025 · Updated last year
THUDM / Self-Contrast
Extensive Self-Contrast Enables Feedback-Free Language Model Alignment
☆20 · Apr 2, 2024 · Updated 2 years ago
AntResearchNLP / AlignXplore
Extended Inductive Reasoning for Personalized Preference Inference from Behavioral Signals
☆11 · Jan 8, 2026 · Updated 6 months ago
YJiangcm / BMC
[ICLR 2025] Bridging and Modeling Correlations in Pairwise Data for Direct Preference Optimization
☆12 · Jan 26, 2025 · Updated last year
zzz47zzz / CET
[ACL 2023] Preserving Commonsense Knowledge from Pre-trained Language Models via Causal Inference
☆24 · Dec 25, 2023 · Updated 2 years ago
dextroushands / pretraind_model_for_nlp_tasks
☆14 · Sep 19, 2022 · Updated 3 years ago
git-cloner / Llama2-chinese
Llama2 Chinese fine-tuning
☆38 · Aug 2, 2023 · Updated 2 years ago
google-research-datasets / recognizing-multimodal-entailment
The dataset consists of public social media url pairs and the corresponding entailment label for an external conference (ACL 2021). Each …
☆14 · Aug 16, 2021 · Updated 4 years ago
lilakk / BLEUBERI
Official repository for "BLEUBERI: BLEU is a surprisingly effective reward for instruction following"
☆32 · Jun 5, 2025 · Updated last year
PlusLabNLP / Narrative-Discourse
☆16 · Nov 5, 2024 · Updated last year
BohanLi0110 / NLP-DA-Papers
☆26 · Nov 20, 2021 · Updated 4 years ago
falonss703 / Awesome-Uncertainty-based-Reinforcement-Learning
🔥🔥🔥 Latest Papers, Codes on Uncertainty-based RL
☆58 · Aug 24, 2025 · Updated 10 months ago
AIM3-RUC / VideoIC
Danmuku dataset
☆12 · Jul 7, 2023 · Updated 3 years ago
RUCAIBox / RLMEC
The official repository of "Improving Large Language Models via Fine-grained Reinforcement Learning with Minimum Editing Constraint"
☆39 · Jan 12, 2024 · Updated 2 years ago
H-TayyarMadabushi / AStitchInLanguageModels
Data and Baselines for AStitchInLanguageModels dataset
☆13 · Oct 31, 2022 · Updated 3 years ago
THUNLP-MT / PLM4MT
Code for our work "MSP: Multi-Stage Prompting for Making Pre-trained Language Models Better Translators" in ACL 2022
☆20 · Mar 18, 2022 · Updated 4 years ago