feyzaakyurek/rl4f

Code for RL4F: Generating Natural Language Feedback with Reinforcement Learning for Repairing Model Outputs. ACL 2023.

☆63

Alternatives and similar repositories for rl4f

Users who are interested in rl4f are comparing it to the libraries listed below.

Shentao-YANG / Preference_Grounded_Guidance
Source codes for "Preference-grounded Token-level Guidance for Language Model Fine-tuning" (NeurIPS 2023).
☆17 · Jan 8, 2025 · Updated last year
UKPLab / on-emergence
Codes and files for the paper Are Emergent Abilities in Large Language Models just In-Context Learning
☆33 · Jan 9, 2025 · Updated last year
amazon-science / faithful-summarization-generation
☆16 · Mar 27, 2023 · Updated 3 years ago
allenai / few_shot_explanations
Code for NAACL 2022 paper "Reframing Human-AI Collaboration for Generating Free-Text Explanations"
☆29 · Apr 28, 2023 · Updated 3 years ago
tianjunz / HIR
☆157 · Mar 18, 2023 · Updated 3 years ago
joyheyueya / declarative-math-word-problem
☆51 · Aug 29, 2023 · Updated 2 years ago
yulonghui / MOCA
Official implementation of "Continual Learning by Modeling Intra-Class Variation" (MOCA). [TMLR 2023]
☆16 · Mar 3, 2023 · Updated 3 years ago
allenai / FineGrainedRLHF
☆283 · Jan 6, 2025 · Updated last year
moqingyan / dsr-lm
☆13 · Jul 8, 2023 · Updated 3 years ago
wenhuchen / TheoremQA
The dataset and code for paper: TheoremQA: A Theorem-driven Question Answering dataset
☆161 · Apr 23, 2024 · Updated 2 years ago
zzzace2000 / robust_cls_model
The code to reproduce CVPR 2021 paper "Towards Robust Classification Model by Counterfactual and Invariant Data Generation"
☆16 · Jul 29, 2021 · Updated 4 years ago
Steven-Ho / VALOR
Implementation of VALOR (Variational Option Discovery Algorithms)
☆10 · Jun 28, 2019 · Updated 7 years ago
HKUST-KnowComp / SubeventWriter
Official code repository for the main conference paper in EMNLP 2022: SubeventWriter: Iterative Sub-event Sequence Generation with Cohere…
☆11 · Oct 16, 2022 · Updated 3 years ago
kamigaito / SLAHAN
SLAHAN is an implementation of Kamigaito et al., 2020, "Syntactically Look-A-Head Attention Network for Sentence Compression", In Proc. o…
☆17 · Jan 27, 2021 · Updated 5 years ago
allenai / feb
Code associated with the paper: "Few-Shot Self-Rationalization with Natural Language Prompts"
☆12 · Apr 27, 2022 · Updated 4 years ago
agi-templar / Stable-Alignment
Multi-agent Social Simulation + Efficient, Effective, and Stable alternative of RLHF. Code for the paper "Training Socially Aligned Langu…
☆356 · Jun 18, 2023 · Updated 3 years ago
wiio12 / POETRY
Code for the paper: Proving Theorems Recursively
☆12 · May 23, 2024 · Updated 2 years ago
HITsz-TMG / ICL-State-Vector
☆12 · Jul 4, 2024 · Updated 2 years ago
gsbDBI / contextual_bandits_evaluation
Offline Policy Evaluation via Adaptive Weighting with Data from Contextual Bandits
☆11 · Oct 21, 2024 · Updated last year
FranxYao / GPT-Bargaining
Code for arXiv 2023: Improving Language Model Negotiation with Self-Play and In-Context Learning from AI Feedback
☆207 · May 24, 2023 · Updated 3 years ago
MARIO-Math-Reasoning / Super_MARIO
☆341 · Jun 5, 2025 · Updated last year
ryokamoi / llm-self-correction-papers
List of papers on Self-Correction of LLMs.
☆82 · May 19, 2026 · Updated 2 months ago
LZhengisme / self-infilling
[ICML 2024] Self-Infilling Code Generation
☆18 · May 5, 2024 · Updated 2 years ago
Baichenjia / Contrastive-UCB
Contrastive UCB: Provably Efficient Contrastive Self-Supervised Learning in Online Reinforcement Learning
☆12 · Jun 16, 2022 · Updated 4 years ago
chujiezheng / LLM-Extrapolation
Official repository for ACL 2025 paper "Model Extrapolation Expedites Alignment"
☆75 · May 20, 2025 · Updated last year
DavidFanzz / llm_decoding
☆12 · Apr 25, 2025 · Updated last year
clinicalml / cotrain-prompting
Code for co-training large language models (e.g. T0) with smaller ones (e.g. BERT) to boost few-shot performance
☆16 · Sep 23, 2022 · Updated 3 years ago
GeorgeLuImmortal / PUnifiedNER
☆25 · Jun 5, 2023 · Updated 3 years ago
chuanyang-Zheng / Progressive-Hint
This is the official implementation of "Progressive-Hint Prompting Improves Reasoning in Large Language Models"
☆208 · Oct 11, 2023 · Updated 2 years ago
fakenewsresearch / dataset
☆26 · Jun 14, 2024 · Updated 2 years ago
Dahoas / QDSyntheticData
☆14 · Aug 15, 2024 · Updated last year
allenai / RL4LMs
A modular RL library to fine-tune language models to human preferences
☆2,393 · Mar 1, 2024 · Updated 2 years ago
GAIR-NLP / auto-j
Generative Judge for Evaluating Alignment
☆251 · Jan 18, 2024 · Updated 2 years ago
tatsu-lab / mlm_inductive_bias
Code Release for "On the Inductive Bias of Masked Language Modeling: From Statistical to Syntactic Dependencies"
☆16 · Apr 13, 2021 · Updated 5 years ago
OSU-NLP-Group / AttrScore
Code, datasets, models for the paper "Automatic Evaluation of Attribution by Large Language Models"
☆56 · Jul 3, 2023 · Updated 3 years ago
apple / ml-reversal-blessing
☆17 · Jul 31, 2025 · Updated 11 months ago
vipulraheja / iterater
Official implementation of the paper "IteraTeR: Understanding Iterative Revision from Human-Written Text" (ACL 2022)
☆83 · Nov 15, 2023 · Updated 2 years ago
Hritikbansal / jpo
☆13 · Jul 2, 2025 · Updated last year
yifanzhang-pro / AutoMathText
[ACL 2025 Findings] Autonomous Data Selection with Zero-shot Generative Classifiers for Mathematical Texts (https://huggingface.co/papers…
☆92 · Nov 23, 2025 · Updated 8 months ago