explodinggradients / nemesis
Reward Model framework for LLM RLHF
☆61 · Updated 2 years ago
Alternatives and similar repositories for nemesis
Users who are interested in nemesis are comparing it to the libraries listed below:
- Supervised instruction finetuning for LLM with HF trainer and Deepspeed ☆36 · Updated 2 years ago
- Lightweight demos for finetuning LLMs. Powered by 🤗 transformers and open-source datasets. ☆78 · Updated 11 months ago
- Codebase accompanying the Summary of a Haystack paper. ☆79 · Updated last year
- Mixing Language Models with Self-Verification and Meta-Verification ☆110 · Updated 9 months ago
- Small and Efficient Mathematical Reasoning LLMs ☆72 · Updated last year
- Open Implementations of LLM Analyses ☆107 · Updated last year
- Large-language Model Evaluation framework with Elo Leaderboard and A-B testing ☆52 · Updated 11 months ago
- Retrieval Augmented Generation Generalized Evaluation Dataset ☆56 · Updated 2 months ago
- ☆85 · Updated 2 years ago
- ☆88 · Updated last year
- Code and data for "StructLM: Towards Building Generalist Models for Structured Knowledge Grounding" (COLM 2024) ☆75 · Updated 11 months ago
- Code and Dataset for Learning to Solve Complex Tasks by Talking to Agents ☆24 · Updated 3 years ago
- Code repo for "Agent Instructs Large Language Models to be General Zero-Shot Reasoners" ☆116 · Updated last year
- Official repo for NAACL 2024 Findings paper "LeTI: Learning to Generate from Textual Interactions." ☆64 · Updated 2 years ago
- PyTorch implementation for MRL ☆19 · Updated last year
- Meta-CoT: Generalizable Chain-of-Thought Prompting in Mixed-task Scenarios with Large Language Models ☆98 · Updated last year
- ☆23 · Updated 2 years ago
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute… ☆49 · Updated last year
- ☆43 · Updated last year
- Using open source LLMs to build synthetic datasets for direct preference optimization ☆66 · Updated last year
- ReBase: Training Task Experts through Retrieval Based Distillation ☆29 · Updated 8 months ago
- This project studies the performance and robustness of language models and task-adaptation methods. ☆153 · Updated last year
- Based on the tree of thoughts paper ☆48 · Updated 2 years ago
- ☆29 · Updated 2 months ago
- ☆95 · Updated 9 months ago
- A repository for transformer critique learning and generation ☆90 · Updated last year
- The Official Repository for "Bring Your Own Data! Self-Supervised Evaluation for Large Language Models" ☆107 · Updated 2 years ago
- Flacuna was developed by fine-tuning Vicuna on Flan-mini, a comprehensive instruction collection encompassing various tasks. Vicuna is al… ☆111 · Updated 2 years ago
- Scalable Meta-Evaluation of LLMs as Evaluators ☆42 · Updated last year
- [NAACL 2024] Struc-Bench: Are Large Language Models Good at Generating Complex Structured Tabular Data? https://aclanthology.org/2024.naa… ☆55 · Updated 2 months ago