ernie-research/Tool-Augmented-Reward-Model

[ICLR'24 spotlight] Tool-Augmented Reward Modeling

☆54

Alternatives and similar repositories for Tool-Augmented-Reward-Model

Users who are interested in Tool-Augmented-Reward-Model are comparing it to the repositories listed below.

yuyq18 / StepTool
☆36 · May 24, 2025 · Updated last year
SparkJiao / dpo-trajectory-reasoning
[EMNLP 2024] Source code for the paper "Learning Planning-based Reasoning with Trajectory Collection and Process Rewards Synthesizing".
☆84 · Jan 14, 2025 · Updated last year
facebookresearch / ToolVerifier
This repository contains the ToolSelect dataset which was used to fine-tune Llama-2 70B for tool selection.
☆23 · Mar 11, 2024 · Updated 2 years ago
lukasgarbas / can-we-tune-together
Combining encoder-based language models
☆11 · Nov 11, 2021 · Updated 4 years ago
CriticBench / CriticBench
[ACL 2024 Findings] CriticBench: Benchmarking LLMs for Critique-Correct Reasoning
☆31 · Mar 5, 2024 · Updated 2 years ago
xiusic / DecisionFlow
☆34 · Aug 26, 2025 · Updated 11 months ago
NumberChiffre / mcts-llm
☆98 · Dec 16, 2024 · Updated last year
Freder-chen / ReasonGenRM
A simple implementation of ReasonGenRM.
☆19 · Apr 21, 2025 · Updated last year
hannahxchen / automatic-paraphrase-dataset-augmentation
Code and data for automatic paraphrase dataset augmentation.
☆11 · Mar 8, 2021 · Updated 5 years ago
pkshashank / GFLeanTransfer
☆14 · Mar 27, 2024 · Updated 2 years ago
fzyzcjy / ai_math_paper_list
AI for Mathematics Paper List
☆17 · Jan 14, 2025 · Updated last year
koalazf99 / tacube
[EMNLP 2022] TaCube: Pre-computing Data Cubes for Answering Numerical-Reasoning Questions over Tabular Data
☆17 · May 17, 2023 · Updated 3 years ago
lipiji / data-summ-cnn_dailymail
non-anonymized cnn/dailymail dataset for text summarization
☆13 · May 30, 2017 · Updated 9 years ago
seraphlabs-ca / SentenceMIM-demo
This repo contains code to reproduce some of the results presented in the paper "SentenceMIM: A Latent Variable Language Model"
☆28 · Jun 22, 2022 · Updated 4 years ago
icip-cas / Verifier-Engineering
Search, Verify and Feedback: Towards Next Generation Post-training Paradigm of Foundation Models via Verifier Engineering
☆63 · Dec 5, 2024 · Updated last year
NonvolatileMemory / flash_attn_gqa
triton ver of gqa flash attn, based on the tutorial
☆12 · Aug 4, 2024 · Updated last year
mhxu1998 / FlexCare
KDD 2024 | FlexCare: Leveraging Cross-Task Synergy for Flexible Multimodal Healthcare Prediction
☆18 · Sep 4, 2024 · Updated last year
jind11 / DAMT
Semi-supervised Domain Adaptation of Machine Translation
☆12 · Dec 8, 2022 · Updated 3 years ago
zjunlp / TRICE
[NAACL 2024] Making Language Models Better Tool Learners with Execution Feedback
☆43 · Mar 14, 2024 · Updated 2 years ago
vicgalle / refined-dpo
Refined Direct Preference Optimization with Synthetic Data for Behavioral Alignment of LLMs
☆13 · Feb 13, 2024 · Updated 2 years ago
MathAutoTag / mathdata
K12 high school mathematics exam question dataset
☆18 · Aug 16, 2023 · Updated 2 years ago
zhaochen0110 / LMLM
Code and data for "Improving Temporal Generalization of Pre-trained Language Models with Lexical Semantic Change" (EMNLP 2022)
☆17 · Dec 8, 2022 · Updated 3 years ago
ritaranx / Collab-RAG
☆30 · Apr 8, 2025 · Updated last year
rocq-community / apery
A formal proof of the irrationality of zeta(3), the Apéry constant [maintainer=@amahboubi,@pi8027]
☆26 · May 28, 2026 · Updated last month
hsing-wang / WMT2020_BioMedical
☆15 · Jul 16, 2021 · Updated 5 years ago
THU-KEG / RM-Bench
[ICLR 25 Oral] RM-Bench: Benchmarking Reward Models of Language Models with Subtlety and Style
☆84 · Jul 18, 2025 · Updated last year
kaistAI / KtrlF
[NAACL 2024] Official repository for "KTRL+F: Knowledge-Augmented In-Document Search"
☆23 · Oct 11, 2024 · Updated last year
hhan1018 / NesTools
[COLING 2025] NesTools: A Dataset for Evaluating Nested Tool Learning Abilities of Large Language Models
☆18 · Jan 18, 2025 · Updated last year
shijunbao / prompt-manager
Centrally manage all prompts.
☆14 · Nov 27, 2024 · Updated last year
lipiji / uChecker
Code of the COLING22 paper "uChecker: Masked Pretrained Language Models as Unsupervised Chinese Spelling Checkers"
☆19 · Aug 17, 2022 · Updated 3 years ago
RUCAIBox / EASYEP
☆29 · Apr 14, 2025 · Updated last year
zhaochen0110 / Cotempqa
Code and data for "Living in the Moment: Can Large Language Models Grasp Co-Temporal Reasoning?" (ACL 2024)
☆31 · Jul 3, 2024 · Updated 2 years ago
kangluoyao / VAP_Former
[MICCAI-2023] Visual-Attribute Prompt Learning for Progressive Mild Cognitive Impairment Prediction
☆15 · Dec 12, 2023 · Updated 2 years ago
ruc-ai4math / LeanStateSearch
☆19 · Apr 5, 2025 · Updated last year
wangyu-ustc / LargeScaleWashing
The official implementation of the paper "Large Scale Knowledge Washing"
☆10 · Jun 12, 2024 · Updated 2 years ago
zhulishe / Quantitative-investment
Use strategy in stock transaction for high revenue.
☆10 · Dec 24, 2015 · Updated 10 years ago
conceptmath / conceptmath
[ACL 2024 Findings] The official repo for "ConceptMath: A Bilingual Concept-wise Benchmark for Measuring Mathematical Reasoning of Large …
☆26 · May 29, 2024 · Updated 2 years ago
hrwise-nlp / ToolsMeetLLMs
☆33 · May 8, 2025 · Updated last year
holarissun / RewardModelingBeyondBradleyTerry
official implementation of ICLR'2025 paper: Rethinking Bradley-Terry Models in Preference-based Reward Modeling: Foundations, Theory, and…
☆73 · Apr 2, 2025 · Updated last year