explodinggradients / nemesis
Reward Model framework for LLM RLHF
☆58 Updated last year
Related projects
Alternatives and complementary repositories for nemesis
- Codebase accompanying the Summary of a Haystack paper. ☆72 Updated 2 months ago
- Small and Efficient Mathematical Reasoning LLMs ☆71 Updated 9 months ago
- Lightweight demos for finetuning LLMs. Powered by 🤗 transformers and open-source datasets. ☆64 Updated last month
- The Official Repository for "Bring Your Own Data! Self-Supervised Evaluation for Large Language Models" ☆109 Updated last year
- Supervised instruction finetuning for LLM with HF trainer and Deepspeed ☆34 Updated last year
- minimal LLM scripts for 24GB VRAM GPUs. training, inference, whatever ☆33 Updated last week
- ☆41 Updated 2 weeks ago
- Meta-CoT: Generalizable Chain-of-Thought Prompting in Mixed-task Scenarios with Large Language Models ☆87 Updated last year
- Code repo for "Agent Instructs Large Language Models to be General Zero-Shot Reasoners" ☆83 Updated 2 months ago
- A repository for transformer critique learning and generation ☆85 Updated 11 months ago
- ☆112 Updated last month
- ☆33 Updated 6 months ago
- Open Implementations of LLM Analyses ☆94 Updated last month
- Scripts for generating synthetic finetuning data for reducing sycophancy. ☆107 Updated last year
- Evaluating LLMs with fewer examples ☆135 Updated 7 months ago
- ☆87 Updated 9 months ago
- Code for "Democratizing Reasoning Ability: Tailored Learning from Large Language Model", EMNLP 2023 ☆31 Updated 11 months ago
- ☆37 Updated last year
- Truth Forest: Toward Multi-Scale Truthfulness in Large Language Models through Intervention without Tuning ☆42 Updated 11 months ago
- ☆22 Updated 4 months ago
- Scalable Meta-Evaluation of LLMs as Evaluators ☆41 Updated 9 months ago
- Official code for "MAmmoTH2: Scaling Instructions from the Web" [NeurIPS 2024] ☆124 Updated 3 weeks ago
- Code for Search-in-the-Chain: Towards Accurate, Credible and Traceable Large Language Models for Knowledge-intensive Tasks ☆47 Updated 7 months ago
- ☆24 Updated last year
- Score LLM pretraining data with classifiers ☆54 Updated last year
- The code for the paper ROUTERBENCH: A Benchmark for Multi-LLM Routing System ☆94 Updated 5 months ago
- ☆127 Updated 7 months ago
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute… ☆48 Updated 4 months ago
- ☆73 Updated 10 months ago
- Spherical Merge Pytorch/HF format Language Models with minimal feature loss. ☆112 Updated last year