google / sycophancy-intervention
Scripts for generating synthetic finetuning data for reducing sycophancy.
☆109 Updated last year
Alternatives and similar repositories for sycophancy-intervention:
Users who are interested in sycophancy-intervention compare it to the repositories listed below.
- ☆126 Updated 5 months ago
- A simple GPT-based evaluation tool for multi-aspect, interpretable assessment of LLMs. ☆85 Updated last year
- Official code for "MAmmoTH2: Scaling Instructions from the Web" [NeurIPS 2024] ☆137 Updated 5 months ago
- The Official Repository for "Bring Your Own Data! Self-Supervised Evaluation for Large Language Models" ☆108 Updated last year
- Code and Data for "Long-context LLMs Struggle with Long In-context Learning" [TMLR2025] ☆105 Updated last month
- Self-Alignment with Principle-Following Reward Models ☆158 Updated last year
- Scalable Meta-Evaluation of LLMs as Evaluators ☆42 Updated last year
- PASTA: Post-hoc Attention Steering for LLMs ☆113 Updated 4 months ago
- [NeurIPS 2023] This is the code for the paper `Large Language Model as Attributed Training Data Generator: A Tale of Diversity and Bias`. ☆151 Updated last year
- Official repository for paper "Weak-to-Strong Extrapolation Expedites Alignment" ☆74 Updated 10 months ago
- Code and data accompanying our paper on arXiv "Faithful Chain-of-Thought Reasoning". ☆157 Updated 11 months ago
- Unofficial implementation of AlpaGasus ☆90 Updated last year
- ☆119 Updated 6 months ago
- [ICLR 2025] InstructRAG: Instructing Retrieval-Augmented Generation via Self-Synthesized Rationales ☆81 Updated 2 months ago
- The GitHub repo for Goal Driven Discovery of Distributional Differences via Language Descriptions ☆69 Updated 2 years ago
- A repository for transformer critique learning and generation ☆89 Updated last year
- ☆96 Updated 9 months ago
- This is the repo for the paper Shepherd -- A Critic for Language Model Generation ☆218 Updated last year
- ☆60 Updated 11 months ago
- "Improving Mathematical Reasoning with Process Supervision" by OPENAI ☆108 Updated this week
- Inspecting and Editing Knowledge Representations in Language Models ☆115 Updated last year
- [ACL'24] Code and data of paper "When is Tree Search Useful for LLM Planning? It Depends on the Discriminator" ☆54 Updated last year
- [EMNLP 2023] The CoT Collection: Improving Zero-shot and Few-shot Learning of Language Models via Chain-of-Thought Fine-Tuning ☆237 Updated last year
- ☆172 Updated last year
- ☆121 Updated 10 months ago
- ☆64 Updated 2 years ago
- Implementation of the paper: "Answering Questions by Meta-Reasoning over Multiple Chains of Thought" ☆94 Updated last year
- Positional Skip-wise Training for Efficient Context Window Extension of LLMs to Extreme Length (ICLR 2024) ☆205 Updated 10 months ago
- Code repo for "Agent Instructs Large Language Models to be General Zero-Shot Reasoners" ☆105 Updated 7 months ago
- Simple next-token-prediction for RLHF ☆224 Updated last year