This is the repo for the paper Shepherd -- A Critic for Language Model Generation
☆222Aug 10, 2023Updated 2 years ago
Alternatives and similar repositories for Shepherd
Users who are interested in Shepherd are comparing it to the libraries listed below.
- Evaluate the Quality of Critique☆37Jun 1, 2024Updated last year
- Official codebase for "SelFee: Iterative Self-Revising LLM Empowered by Self-Feedback Generation"☆227Jun 6, 2023Updated 2 years ago
- Scratchpad/Chain-of-Thought Prompts☆12Jun 6, 2022Updated 3 years ago
- A large-scale, fine-grained, diverse preference dataset (and models).☆367Dec 29, 2023Updated 2 years ago
- [ICLR 2024 & NeurIPS 2023 WS] An Evaluator LM that is open-source, offers reproducible evaluation, and is inexpensive to use. Specifically d…☆314Nov 11, 2023Updated 2 years ago
- Scripts for generating synthetic finetuning data for reducing sycophancy.☆121Aug 16, 2023Updated 2 years ago
- Xwin-LM: Powerful, Stable, and Reproducible LLM Alignment☆1,037May 31, 2024Updated last year
- Generative Judge for Evaluating Alignment☆249Jan 18, 2024Updated 2 years ago
- ☆44Jun 2, 2024Updated last year
- Source code of "Reasons to Reject? Aligning Language Models with Judgments"☆58Feb 29, 2024Updated 2 years ago
- PyTorch implementation of HyperLLaVA: Dynamic Visual and Language Expert Tuning for Multimodal Large Language Models☆28Mar 22, 2024Updated 2 years ago
- 800,000 step-level correctness labels on LLM solutions to MATH problems☆2,115Jun 1, 2023Updated 2 years ago
- Salesforce open-source LLMs with 8k sequence length.☆726Jan 31, 2025Updated last year
- ☆284Jan 6, 2025Updated last year
- Code for the paper <SelfCheck: Using LLMs to Zero-Shot Check Their Own Step-by-Step Reasoning>☆47Aug 1, 2023Updated 2 years ago
- Dromedary: towards helpful, ethical and reliable LLMs.☆1,142Sep 18, 2025Updated 6 months ago
- In-Context Alignment: Chat with Vanilla Language Models Before Fine-Tuning☆34Aug 9, 2023Updated 2 years ago
- Radiantloom Email Assist 7B is an email-assistant large language model fine-tuned from Zephyr-7B-Beta, over a custom-curated dataset of 1…☆14Jan 19, 2024Updated 2 years ago
- The Official Repository for "Bring Your Own Data! Self-Supervised Evaluation for Large Language Models"☆107Sep 23, 2023Updated 2 years ago
- FeedbackQA: Improving Question Answering Post-Deployment with Interactive Feedback☆12Jul 13, 2022Updated 3 years ago
- Code for arXiv 2023: Improving Language Model Negotiation with Self-Play and In-Context Learning from AI Feedback☆208May 24, 2023Updated 2 years ago
- A library with extensible implementations of DPO, KTO, PPO, ORPO, and other human-aware loss functions (HALOs).☆903Sep 30, 2025Updated 6 months ago
- Codes and Data for Scaling Relationship on Learning Mathematical Reasoning with Large Language Models☆270Sep 12, 2024Updated last year
- Self-Alignment with Principle-Following Reward Models☆170Sep 18, 2025Updated 6 months ago
- FacTool: Factuality Detection in Generative AI☆924Aug 19, 2024Updated last year
- AllenAI's post-training codebase☆3,683Updated this week
- This repository contains code to quantitatively evaluate instruction-tuned models such as Alpaca and Flan-T5 on held-out tasks.☆552Mar 10, 2024Updated 2 years ago
- Dataset Reset Policy Optimization☆31Apr 12, 2024Updated 2 years ago
- O1 Replication Journey☆1,999Jan 14, 2025Updated last year
- [ACL2023] We introduce LLM-Blender, an innovative ensembling framework to attain consistently superior performance by leveraging the dive…☆977Oct 22, 2024Updated last year
- Secrets of RLHF in Large Language Models Part I: PPO☆1,421Mar 3, 2024Updated 2 years ago
- ☆41May 22, 2025Updated 10 months ago
- A Comprehensive Benchmark to Evaluate LLMs as Agents (ICLR'24)☆3,316Feb 8, 2026Updated 2 months ago
- ☆23Aug 7, 2023Updated 2 years ago
- Code for the paper "Rethinking Benchmark and Contamination for Language Models with Rephrased Samples"☆319Dec 20, 2023Updated 2 years ago
- Code for the paper "Fishing for Magikarp"☆182Updated this week
- Human preference data for "Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback"☆1,837Jun 17, 2025Updated 9 months ago
- A simulation framework for RLHF and alternatives. Develop your RLHF method without collecting human data.☆842Jul 1, 2024Updated last year
- [NeurIPS 2024] A comprehensive benchmark for evaluating critique ability of LLMs☆49Nov 29, 2024Updated last year