This is the repo for the paper Shepherd -- A Critic for Language Model Generation
☆222Aug 10, 2023Updated 2 years ago
Alternatives and similar repositories for Shepherd
Users that are interested in Shepherd are comparing it to the libraries listed below.
- Evaluate the Quality of Critique☆36Jun 1, 2024Updated last year
- Official codebase for "SelFee: Iterative Self-Revising LLM Empowered by Self-Feedback Generation"☆228Jun 6, 2023Updated 2 years ago
- Scratchpad/Chain-of-Thought Prompts☆12Jun 6, 2022Updated 3 years ago
- A large-scale, fine-grained, diverse preference dataset (and models).☆364Dec 29, 2023Updated 2 years ago
- [ICLR 2024 & NeurIPS 2023 WS] An Evaluator LM that is open-source, offers reproducible evaluation, and inexpensive to use. Specifically d…☆312Nov 11, 2023Updated 2 years ago
- Scripts for generating synthetic finetuning data for reducing sycophancy.☆121Aug 16, 2023Updated 2 years ago
- Xwin-LM: Powerful, Stable, and Reproducible LLM Alignment☆1,038May 31, 2024Updated last year
- Generative Judge for Evaluating Alignment☆248Jan 18, 2024Updated 2 years ago
- ☆44Jun 2, 2024Updated last year
- Source code of "Reasons to Reject? Aligning Language Models with Judgments"☆58Feb 29, 2024Updated 2 years ago
- PyTorch implementation of HyperLLaVA: Dynamic Visual and Language Expert Tuning for Multimodal Large Language Models☆28Mar 22, 2024Updated 2 years ago
- 800,000 step-level correctness labels on LLM solutions to MATH problems☆2,106Jun 1, 2023Updated 2 years ago
- Salesforce open-source LLMs with 8k sequence length.☆726Jan 31, 2025Updated last year
- ☆284Jan 6, 2025Updated last year
- Code for the paper "SelfCheck: Using LLMs to Zero-Shot Check Their Own Step-by-Step Reasoning"☆48Aug 1, 2023Updated 2 years ago
- Dromedary: towards helpful, ethical and reliable LLMs.☆1,144Sep 18, 2025Updated 6 months ago
- In-Context Alignment: Chat with Vanilla Language Models Before Fine-Tuning☆35Aug 9, 2023Updated 2 years ago
- Radiantloom Email Assist 7B is an email-assistant large language model fine-tuned from Zephyr-7B-Beta, over a custom-curated dataset of 1…☆14Jan 19, 2024Updated 2 years ago
- The Official Repository for "Bring Your Own Data! Self-Supervised Evaluation for Large Language Models"☆107Sep 23, 2023Updated 2 years ago
- FeedbackQA: Improving Question Answering Post-Deployment with Interactive Feedback☆12Jul 13, 2022Updated 3 years ago
- Code for arXiv 2023: Improving Language Model Negotiation with Self-Play and In-Context Learning from AI Feedback☆209May 24, 2023Updated 2 years ago
- A library with extensible implementations of DPO, KTO, PPO, ORPO, and other human-aware loss functions (HALOs).☆907Sep 30, 2025Updated 5 months ago
- Codes and Data for Scaling Relationship on Learning Mathematical Reasoning with Large Language Models☆270Sep 12, 2024Updated last year
- Self-Alignment with Principle-Following Reward Models☆170Sep 18, 2025Updated 6 months ago
- FacTool: Factuality Detection in Generative AI☆918Aug 19, 2024Updated last year
- AllenAI's post-training codebase☆3,643Updated this week
- This repository contains code to quantitatively evaluate instruction-tuned models such as Alpaca and Flan-T5 on held-out tasks.☆552Mar 10, 2024Updated 2 years ago
- Dataset Reset Policy Optimization☆31Apr 12, 2024Updated last year
- O1 Replication Journey☆1,999Jan 14, 2025Updated last year
- A Comprehensive Benchmark to Evaluate LLMs as Agents (ICLR'24)☆3,253Feb 8, 2026Updated last month
- [ACL2023] We introduce LLM-Blender, an innovative ensembling framework to attain consistently superior performance by leveraging the dive…☆978Oct 22, 2024Updated last year
- Secrets of RLHF in Large Language Models Part I: PPO☆1,422Mar 3, 2024Updated 2 years ago
- ☆41May 22, 2025Updated 10 months ago
- ☆23Aug 7, 2023Updated 2 years ago
- Code for the paper "Rethinking Benchmark and Contamination for Language Models with Rephrased Samples"☆320Dec 20, 2023Updated 2 years ago
- Human preference data for "Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback"☆1,832Jun 17, 2025Updated 9 months ago
- A simulation framework for RLHF and alternatives. Develop your RLHF method without collecting human data.☆843Jul 1, 2024Updated last year
- [NeurIPS 2024] A comprehensive benchmark for evaluating critique ability of LLMs☆49Nov 29, 2024Updated last year
- Public repo for the NeurIPS 2023 paper "Unlimiformer: Long-Range Transformers with Unlimited Length Input"☆1,065Mar 7, 2024Updated 2 years ago