This is the repo for the paper "Shepherd: A Critic for Language Model Generation"
☆222Aug 10, 2023Updated 2 years ago
Alternatives and similar repositories for Shepherd
Users who are interested in Shepherd are comparing it to the repositories listed below
- Evaluate the Quality of Critique☆36Jun 1, 2024Updated last year
- Official codebase for "SelFee: Iterative Self-Revising LLM Empowered by Self-Feedback Generation"☆228Jun 6, 2023Updated 2 years ago
- PyTorch implementation of HyperLLaVA: Dynamic Visual and Language Expert Tuning for Multimodal Large Language Models☆28Mar 22, 2024Updated last year
- [ICLR 2024 & NeurIPS 2023 WS] An Evaluator LM that is open-source, offers reproducible evaluation, and is inexpensive to use. Specifically d…☆311Nov 11, 2023Updated 2 years ago
- Scripts for generating synthetic finetuning data for reducing sycophancy.☆121Aug 16, 2023Updated 2 years ago
- Xwin-LM: Powerful, Stable, and Reproducible LLM Alignment☆1,036May 31, 2024Updated last year
- A large-scale, fine-grained, diverse preference dataset (and models).☆363Dec 29, 2023Updated 2 years ago
- [NAACL 2024] Struc-Bench: Are Large Language Models Good at Generating Complex Structured Tabular Data? https://aclanthology.org/2024.naa…☆55Jul 31, 2025Updated 7 months ago
- Dataset Reset Policy Optimization☆31Apr 12, 2024Updated last year
- Source code of "Reasons to Reject? Aligning Language Models with Judgments"☆58Feb 29, 2024Updated 2 years ago
- 800,000 step-level correctness labels on LLM solutions to MATH problems☆2,094Jun 1, 2023Updated 2 years ago
- The Official Repository for "Bring Your Own Data! Self-Supervised Evaluation for Large Language Models"☆107Sep 23, 2023Updated 2 years ago
- Self-Alignment with Principle-Following Reward Models☆169Sep 18, 2025Updated 5 months ago
- Salesforce open-source LLMs with 8k sequence length.☆725Jan 31, 2025Updated last year
- Unofficial implementation of Chain of Hindsight (https://arxiv.org/abs/2302.02676) using PyTorch and Hugging Face Trainers.☆11Apr 5, 2023Updated 2 years ago
- Scratchpad/Chain-of-Thought Prompts☆12Jun 6, 2022Updated 3 years ago
- Radiantloom Email Assist 7B is an email-assistant large language model fine-tuned from Zephyr-7B-Beta, over a custom-curated dataset of 1…☆14Jan 19, 2024Updated 2 years ago
- Dromedary: towards helpful, ethical and reliable LLMs.☆1,144Sep 18, 2025Updated 5 months ago
- Generative Judge for Evaluating Alignment☆250Jan 18, 2024Updated 2 years ago
- 🐙 OctoPack: Instruction Tuning Code Large Language Models☆478Feb 5, 2025Updated last year
- [NeurIPS 2024] Train LLMs with diverse system messages reflecting individualized preferences to generalize to unseen system messages☆53Aug 10, 2025Updated 6 months ago
- [ACL2023] We introduce LLM-Blender, an innovative ensembling framework to attain consistently superior performance by leveraging the dive…☆976Oct 22, 2024Updated last year
- In-Context Alignment: Chat with Vanilla Language Models Before Fine-Tuning☆35Aug 9, 2023Updated 2 years ago
- A library with extensible implementations of DPO, KTO, PPO, ORPO, and other human-aware loss functions (HALOs).☆906Sep 30, 2025Updated 5 months ago
- Code and data for "Scaling Relationship on Learning Mathematical Reasoning with Large Language Models"☆270Sep 12, 2024Updated last year
- THOUGHTSCULPT, a general reasoning and search method for complex tasks☆13Dec 13, 2024Updated last year
- FacTool: Factuality Detection in Generative AI☆913Aug 19, 2024Updated last year
- This repository contains code to quantitatively evaluate instruction-tuned models such as Alpaca and Flan-T5 on held-out tasks.☆552Mar 10, 2024Updated last year
- A Comprehensive Benchmark to Evaluate LLMs as Agents (ICLR'24)☆3,187Feb 8, 2026Updated 3 weeks ago
- Preference Transformer: Modeling Human Preferences using Transformers for RL (ICLR2023 Accepted)☆167Oct 15, 2023Updated 2 years ago
- [COLM 2024] LoraHub: Efficient Cross-Task Generalization via Dynamic LoRA Composition☆669Jul 22, 2024Updated last year
- Secrets of RLHF in Large Language Models Part I: PPO☆1,416Mar 3, 2024Updated 2 years ago
- Public repo for the NeurIPS 2023 paper "Unlimiformer: Long-Range Transformers with Unlimited Length Input"☆1,065Mar 7, 2024Updated last year
- AllenAI's post-training codebase☆3,592Feb 24, 2026Updated last week
- Code for the paper "SelfCheck: Using LLMs to Zero-Shot Check Their Own Step-by-Step Reasoning"☆48Aug 1, 2023Updated 2 years ago
- Official implementation of TransNormerLLM: A Faster and Better LLM☆252Jan 23, 2024Updated 2 years ago