mingkaid / rl-prompt

Accompanying repo for the RLPrompt paper

☆358

Alternatives and similar repositories for rl-prompt

Users who are interested in rl-prompt are comparing it to the libraries listed below.

kojima-takeshi188 / zero_shot_cot
Prod Env
☆434 Updated 2 years ago
eric-mitchell / mend
MEND: Fast Model Editing at Scale
☆253 Updated 2 years ago
allenai / FineGrainedRLHF
☆280 Updated 10 months ago
glgh / awesome-llm-human-preference-datasets
A curated list of Human Preference Datasets for LLM fine-tuning, RLHF, and eval.
☆384 Updated 2 years ago
RUCAIBox / HaluEval
This is the repository of HaluEval, a large-scale hallucination evaluation benchmark for Large Language Models.
☆532 Updated last year
jeffhj / LM-reasoning
This repository contains a collection of papers and resources on Reasoning in Large Language Models.
☆565 Updated 2 years ago
jinlanfu / GPTScore
Source Code of Paper "GPTScore: Evaluate as You Desire"
☆257 Updated 2 years ago
Alrope123 / rethinking-demonstrations
☆177 Updated last year
shmsw25 / FActScore
A package to evaluate factuality of long-form generation. Original implementation of our EMNLP 2023 paper "FActScore: Fine-grained Atomic…
☆406 Updated 7 months ago
shizhediao / active-prompt
Source code for the paper "Active Prompting with Chain-of-Thought for Large Language Models"
☆247 Updated last year
xingyaoww / mint-bench
Official Repo for ICLR 2024 paper MINT: Evaluating LLMs in Multi-turn Interaction with Tools and Language Feedback by Xingyao Wang*, Ziha…
☆134 Updated last year
veronica320 / Faithful-COT
Code and data accompanying our paper on arXiv "Faithful Chain-of-Thought Reasoning".
☆164 Updated last year
teacherpeterpan / self-correction-llm-papers
This is a collection of research papers for Self-Correcting Large Language Models with Automated Feedback.
☆558 Updated last year
likenneth / honest_llama
Inference-Time Intervention: Eliciting Truthful Answers from a Language Model
☆556 Updated 10 months ago
anthropics / ConstitutionalHarmlessnessPaper
☆248 Updated 2 years ago
agi-templar / Stable-Alignment
Multi-agent Social Simulation + Efficient, Effective, and Stable alternative of RLHF. Code for the paper "Training Socially Aligned Langu…
☆354 Updated 2 years ago
txsun1997 / Black-Box-Tuning
ICML'2022: Black-Box Tuning for Language-Model-as-a-Service & EMNLP'2022: BBTv2: Towards a Gradient-Free Future with Large Language Model…
☆272 Updated 3 years ago
facebookresearch / MetaICL
An original implementation of "MetaICL: Learning to Learn In Context" by Sewon Min, Mike Lewis, Luke Zettlemoyer and Hannaneh Hajishirzi
☆271 Updated 2 years ago
AI21Labs / in-context-ralm
☆294 Updated last year
mkshing / Prompt-Tuning
Implementation of "The Power of Scale for Parameter-Efficient Prompt Tuning"
☆168 Updated 4 years ago
haoliuhl / chain-of-hindsight
Simple next-token-prediction for RLHF
☆227 Updated 2 years ago
TIGER-AI-Lab / Program-of-Thoughts
Data and Code for Program of Thoughts [TMLR 2023]
☆292 Updated last year
evandez / REMEDI
Inspecting and Editing Knowledge Representations in Language Models
☆119 Updated 2 years ago
jayelm / gisting
Learning to Compress Prompts with Gist Tokens - https://arxiv.org/abs/2304.08467
☆300 Updated 9 months ago
facebookresearch / atlas
Code repository supporting the paper "Atlas: Few-shot Learning with Retrieval Augmented Language Models" (https://arxiv.org/abs/2208.03…
☆551 Updated 2 years ago
nelson-liu / lost-in-the-middle
Code and data for "Lost in the Middle: How Language Models Use Long Contexts"
☆365 Updated last year
SinclairCoder / Instruction-Tuning-Papers
Reading list on instruction tuning, a trend that started with Natural-Instruction (ACL 2022), FLAN (ICLR 2022) and T0 (ICLR 2022).
☆768 Updated 2 years ago
PKU-Alignment / beavertails
BeaverTails is a collection of datasets designed to facilitate research on safety alignment in large language models (LLMs).
☆169 Updated 2 years ago
AlexTMallen / adaptive-retrieval
☆189 Updated 5 months ago
princeton-nlp / ALCE
[EMNLP 2023] Enabling Large Language Models to Generate Text with Citations. Paper: https://arxiv.org/abs/2305.14627
☆501 Updated last year