This is the repo for the paper Shepherd -- A Critic for Language Model Generation
☆223Aug 10, 2023Updated 2 years ago
Alternatives and similar repositories for Shepherd
Users who are interested in Shepherd are comparing it to the libraries listed below.
- Evaluate the Quality of Critique☆37Jun 1, 2024Updated last year
- Official codebase for "SelFee: Iterative Self-Revising LLM Empowered by Self-Feedback Generation"☆227Jun 6, 2023Updated 2 years ago
- Scratchpad/Chain-of-Thought Prompts☆12Jun 6, 2022Updated 3 years ago
- A large-scale, fine-grained, diverse preference dataset (and models).☆367Dec 29, 2023Updated 2 years ago
- [ICLR 2024 & NeurIPS 2023 WS] An Evaluator LM that is open-source, offers reproducible evaluation, and inexpensive to use. Specifically d…☆317Nov 11, 2023Updated 2 years ago
- Scripts for generating synthetic finetuning data for reducing sycophancy.☆121Aug 16, 2023Updated 2 years ago
- Xwin-LM: Powerful, Stable, and Reproducible LLM Alignment☆1,038May 31, 2024Updated last year
- Generative Judge for Evaluating Alignment☆249Jan 18, 2024Updated 2 years ago
- ☆44Jun 2, 2024Updated last year
- Source code of "Reasons to Reject? Aligning Language Models with Judgments"☆58Feb 29, 2024Updated 2 years ago
- Pytorch implementation of HyperLLaVA: Dynamic Visual and Language Expert Tuning for Multimodal Large Language Models☆28Mar 22, 2024Updated 2 years ago
- 800,000 step-level correctness labels on LLM solutions to MATH problems☆2,133Jun 1, 2023Updated 2 years ago
- Salesforce open-source LLMs with 8k sequence length.☆727Jan 31, 2025Updated last year
- ☆284Jan 6, 2025Updated last year
- Code for the paper <SelfCheck: Using LLMs to Zero-Shot Check Their Own Step-by-Step Reasoning>☆47Aug 1, 2023Updated 2 years ago
- Dromedary: towards helpful, ethical and reliable LLMs.☆1,140Sep 18, 2025Updated 8 months ago
- In-Context Alignment: Chat with Vanilla Language Models Before Fine-Tuning☆34Aug 9, 2023Updated 2 years ago
- Radiantloom Email Assist 7B is an email-assistant large language model fine-tuned from Zephyr-7B-Beta, over a custom-curated dataset of 1…☆14Jan 19, 2024Updated 2 years ago
- The Official Repository for "Bring Your Own Data! Self-Supervised Evaluation for Large Language Models"☆108Sep 23, 2023Updated 2 years ago
- FeedbackQA: Improving Question Answering Post-Deployment with Interactive Feedback☆12Jul 13, 2022Updated 3 years ago
- Code for arXiv 2023: Improving Language Model Negotiation with Self-Play and In-Context Learning from AI Feedback☆208May 24, 2023Updated 3 years ago
- A library with extensible implementations of DPO, KTO, PPO, ORPO, and other human-aware loss functions (HALOs).☆904Sep 30, 2025Updated 7 months ago
- Codes and Data for Scaling Relationship on Learning Mathematical Reasoning with Large Language Models☆270Sep 12, 2024Updated last year
- Self-Alignment with Principle-Following Reward Models☆170Sep 18, 2025Updated 8 months ago
- FacTool: Factuality Detection in Generative AI☆928Aug 19, 2024Updated last year
- AllenAI's post-training codebase☆3,729Updated this week
- This repository contains code to quantitatively evaluate instruction-tuned models such as Alpaca and Flan-T5 on held-out tasks.☆553Mar 10, 2024Updated 2 years ago
- Dataset Reset Policy Optimization☆31Apr 12, 2024Updated 2 years ago
- O1 Replication Journey☆2,000Jan 14, 2025Updated last year
- Secrets of RLHF in Large Language Models Part I: PPO☆1,427Mar 3, 2024Updated 2 years ago
- [ACL2023] We introduce LLM-Blender, an innovative ensembling framework to attain consistently superior performance by leveraging the dive…☆981Oct 22, 2024Updated last year
- ☆41May 22, 2025Updated last year
- ☆23Aug 7, 2023Updated 2 years ago
- A Comprehensive Benchmark to Evaluate LLMs as Agents (ICLR'24)☆3,444Feb 8, 2026Updated 3 months ago
- Code for the paper "Rethinking Benchmark and Contamination for Language Models with Rephrased Samples"☆321Dec 20, 2023Updated 2 years ago
- Code for the paper "Fishing for Magikarp"☆188Updated this week
- Human preference data for "Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback"☆1,841Jun 17, 2025Updated 11 months ago
- A simulation framework for RLHF and alternatives. Develop your RLHF method without collecting human data.☆845Jul 1, 2024Updated last year
- [NeurIPS 2024] A comprehensive benchmark for evaluating critique ability of LLMs☆48Nov 29, 2024Updated last year