explodinggradients / nemesis
Reward Model framework for LLM RLHF
☆61 Updated 2 years ago
Alternatives and similar repositories for nemesis
Users who are interested in nemesis are comparing it to the libraries listed below.
- Lightweight demos for finetuning LLMs. Powered by 🤗 transformers and open-source datasets. ☆77 Updated 9 months ago
- Supervised instruction finetuning for LLM with HF trainer and Deepspeed ☆35 Updated 2 years ago
- Small and Efficient Mathematical Reasoning LLMs ☆71 Updated last year
- Mixing Language Models with Self-Verification and Meta-Verification ☆105 Updated 7 months ago
- Open Implementations of LLM Analyses ☆105 Updated 9 months ago
- Meta-CoT: Generalizable Chain-of-Thought Prompting in Mixed-task Scenarios with Large Language Models ☆97 Updated last year
- PyTorch implementation for MRL ☆19 Updated last year
- Large-language Model Evaluation framework with Elo Leaderboard and A-B testing ☆52 Updated 9 months ago
- Codebase accompanying the Summary of a Haystack paper. ☆79 Updated 10 months ago
- Official repo for NAACL 2024 Findings paper "LeTI: Learning to Generate from Textual Interactions." ☆65 Updated 2 years ago
- ☆41 Updated last year
- ☆23 Updated 2 years ago
- ☆48 Updated last year
- Notus is a collection of fine-tuned LLMs using SFT, DPO, SFT+DPO, and/or any other RLHF techniques, while always keeping a data-first app… ☆168 Updated last year
- Verifiers for LLM Reinforcement Learning ☆68 Updated 3 months ago
- Finding semantically meaningful and accurate prompts. ☆47 Updated last year
- Resources related to EACL 2023 paper "SwitchPrompt: Learning Domain-Specific Gated Soft Prompts for Classification in Low-Resource Domain… ☆52 Updated 2 years ago
- A set of utilities for running few-shot prompting experiments on large-language models ☆122 Updated last year
- ☆84 Updated last year
- Based on the tree of thoughts paper ☆48 Updated last year
- [NAACL 2024] Struc-Bench: Are Large Language Models Good at Generating Complex Structured Tabular Data? https://aclanthology.org/2024.naa… ☆54 Updated this week
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute… ☆49 Updated last year
- The Official Repository for "Bring Your Own Data! Self-Supervised Evaluation for Large Language Models" ☆107 Updated last year
- Retrieval Augmented Generation Generalized Evaluation Dataset ☆54 Updated 3 weeks ago
- Evaluating tool-augmented LLMs in conversation settings ☆85 Updated last year
- A repository for transformer critique learning and generation ☆90 Updated last year
- Code and data for "StructLM: Towards Building Generalist Models for Structured Knowledge Grounding" (COLM 2024) ☆75 Updated 9 months ago
- Code and Dataset for Learning to Solve Complex Tasks by Talking to Agents ☆24 Updated 3 years ago
- Code repo for "Agent Instructs Large Language Models to be General Zero-Shot Reasoners" ☆115 Updated 10 months ago
- Pre-training code for CrystalCoder 7B LLM ☆55 Updated last year