Code for RL4F: Generating Natural Language Feedback with Reinforcement Learning for Repairing Model Outputs. ACL 2023.
☆64Nov 27, 2024Updated last year
Alternatives and similar repositories for rl4f
Users who are interested in rl4f are comparing it to the libraries listed below
- Source code for "Preference-grounded Token-level Guidance for Language Model Fine-tuning" (NeurIPS 2023).☆17Jan 8, 2025Updated last year
- ☆49Aug 29, 2023Updated 2 years ago
- The code to reproduce CVPR 2021 paper "Towards Robust Classification Model by Counterfactual and Invariant Data Generation"☆17Jul 29, 2021Updated 4 years ago
- Code for NAACL 2022 paper "Reframing Human-AI Collaboration for Generating Free-Text Explanations"☆31Apr 28, 2023Updated 2 years ago
- Code and files for the paper "Are Emergent Abilities in Large Language Models just In-Context Learning?"☆33Jan 9, 2025Updated last year
- Explaining neural decisions contrastively to alternative decisions.☆25Mar 18, 2021Updated 4 years ago
- ☆282Jan 6, 2025Updated last year
- ☆26May 30, 2023Updated 2 years ago
- ☆12Jul 4, 2024Updated last year
- ☆30Jun 19, 2023Updated 2 years ago
- Offline Policy Evaluation via Adaptive Weighting with Data from Contextual Bandits☆10Oct 21, 2024Updated last year
- ☆11Mar 13, 2023Updated 2 years ago
- [ICLR 2023] PyTorch code of Summarization Programs: Interpretable Abstractive Summarization with Neural Modular Trees☆24Jun 19, 2023Updated 2 years ago
- ☆78Jun 20, 2025Updated 8 months ago
- Multi-agent Social Simulation + Efficient, Effective, and Stable alternative of RLHF. Code for the paper "Training Socially Aligned Langu…☆354Jun 18, 2023Updated 2 years ago
- Continual Memorization of Factoids in Large Language Models☆12Nov 20, 2024Updated last year
- ☆19Jul 31, 2025Updated 7 months ago
- ☆13Jul 2, 2025Updated 8 months ago
- Contrastive UCB: Provably Efficient Contrastive Self-Supervised Learning in Online Reinforcement Learning☆11Jun 16, 2022Updated 3 years ago
- ☆12Jul 8, 2023Updated 2 years ago
- ☆12Jan 25, 2024Updated 2 years ago
- Code for the paper: Proving Theorems Recursively☆12May 23, 2024Updated last year
- A sketch-based system for semantic parsing☆10Nov 21, 2022Updated 3 years ago
- This is the official implementation of "Progressive-Hint Prompting Improves Reasoning in Large Language Models"☆209Oct 11, 2023Updated 2 years ago
- ☆50Oct 24, 2023Updated 2 years ago
- Code, datasets, models for the paper "Automatic Evaluation of Attribution by Large Language Models"☆56Jul 3, 2023Updated 2 years ago
- ☆158Mar 18, 2023Updated 2 years ago
- Accompanying code for our NeurIPS 2019 paper☆12Nov 7, 2019Updated 6 years ago
- ☆16Mar 27, 2023Updated 2 years ago
- ☆12Aug 15, 2022Updated 3 years ago
- “Style Transfer as Data Augmentation: A Case Study on Named Entity Recognition” (EMNLP 2022)☆16Feb 2, 2023Updated 3 years ago
- [ACL 2025 Findings] Autonomous Data Selection with Zero-shot Generative Classifiers for Mathematical Texts (As Huggingface Daily Papers: …☆90Nov 23, 2025Updated 3 months ago
- PyTorch implementation of experiments in the paper Aligning Language Models with Human Preferences via a Bayesian Approach☆32Nov 6, 2023Updated 2 years ago
- ☆342Jun 5, 2025Updated 8 months ago
- Code Release for "On the Inductive Bias of Masked Language Modeling: From Statistical to Syntactic Dependencies"☆16Apr 13, 2021Updated 4 years ago
- Scripts for pushing models to huggingface repos☆15Sep 11, 2025Updated 5 months ago
- ☆15Jul 9, 2025Updated 7 months ago
- A weak supervision framework for (partial) labeling functions☆16Jul 15, 2024Updated last year
- Implementation for the paper "Learning Invariant Representation for Continual Learning" in PyTorch.☆12Jan 31, 2021Updated 5 years ago