facebookresearch / Shepherd

This is the repo for the paper "Shepherd: A Critic for Language Model Generation"

☆219

Alternatives and similar repositories for Shepherd

Users who are interested in Shepherd are comparing it to the repositories listed below.

bhargaviparanjape / language-programmes
☆173 Updated 2 years ago
haoliuhl / chain-of-hindsight
Simple next-token-prediction for RLHF
☆227 Updated 2 years ago
FranxYao / GPT-Bargaining
Code for arXiv 2023: Improving Language Model Negotiation with Self-Play and In-Context Learning from AI Feedback
☆208 Updated 2 years ago
night-chen / ToolQA
ToolQA, a new dataset to evaluate the capabilities of LLMs in answering challenging questions with external tools. It offers two levels …
☆282 Updated 2 years ago
IBM / SALMON
Self-Alignment with Principle-Following Reward Models
☆169 Updated 2 months ago
kaistAI / CoT-Collection
[EMNLP 2023] The CoT Collection: Improving Zero-shot and Few-shot Learning of Language Models via Chain-of-Thought Fine-Tuning
☆250 Updated 2 years ago
SALT-NLP / demonstrated-feedback
☆129 Updated last year
akoksal / LongForm
Reverse Instructions to generate instruction tuning data with corpus examples
☆216 Updated last year
neulab / gemini-benchmark
☆150 Updated last year
LAION-AI / Open-Instruction-Generalist
Open Instruction Generalist is an assistant trained on massive synthetic instructions to perform many millions of tasks
☆209 Updated last year
nlpxucan / evol-instruct
☆277 Updated 2 years ago
Re-Align / URIAL
☆313 Updated last year
jayelm / gisting
Learning to Compress Prompts with Gist Tokens - https://arxiv.org/abs/2304.08467
☆300 Updated 9 months ago
allenai / WildBench
Benchmarking LLMs with Challenging Tasks from Real Users
☆246 Updated last year
sambanova / toolbench
ToolBench, an evaluation suite for LLM tool manipulation capabilities.
☆164 Updated last year
veronica320 / Faithful-COT
Code and data accompanying our paper on arXiv "Faithful Chain-of-Thought Reasoning".
☆164 Updated last year
orhonovich / unnatural-instructions
☆180 Updated 2 years ago
TIGER-AI-Lab / MAmmoTH2
Official code for "MAmmoTH2: Scaling Instructions from the Web" [NeurIPS 2024]
☆149 Updated last year
p-lambda / dsir
DSIR large-scale data selection framework for language model training
☆266 Updated last year
tianjunz / HIR
☆159 Updated 2 years ago
google / sycophancy-intervention
Scripts for generating synthetic finetuning data for reducing sycophancy.
☆117 Updated 2 years ago
huggingface / datablations
Scaling Data-Constrained Language Models
☆342 Updated 5 months ago
anthropics / ConstitutionalHarmlessnessPaper
☆248 Updated 2 years ago
shizhediao / active-prompt
Source code for the paper "Active Prompting with Chain-of-Thought for Large Language Models"
☆247 Updated last year
kaistAI / FLASK
[ICLR 2024 Spotlight] FLASK: Fine-grained Language Model Evaluation based on Alignment Skill Sets
☆218 Updated last year
yueyu1030 / AttrPrompt
[NeurIPS 2023] This is the code for the paper "Large Language Model as Attributed Training Data Generator: A Tale of Diversity and Bias".
☆156 Updated 2 years ago
kaistAI / SelFee
Official codebase for "SelFee: Iterative Self-Revising LLM Empowered by Self-Feedback Generation"
☆228 Updated 2 years ago
OpenBMB / UltraFeedback
A large-scale, fine-grained, diverse preference dataset (and models).
☆356 Updated last year
wang-research-lab / agentinstruct
Code repo for "Agent Instructs Large Language Models to be General Zero-Shot Reasoners"
☆117 Updated last month
Anni-Zou / Meta-CoT
Meta-CoT: Generalizable Chain-of-Thought Prompting in Mixed-task Scenarios with Large Language Models
☆99 Updated 2 years ago