explodinggradients / nemesis
Reward Model framework for LLM RLHF
★60 Updated last year
Alternatives and similar repositories for nemesis:
Users who are interested in nemesis are comparing it to the libraries listed below.
- Small and Efficient Mathematical Reasoning LLMs ★71 Updated last year
- Supervised instruction finetuning for LLM with HF trainer and Deepspeed ★34 Updated last year
- Lightweight demos for finetuning LLMs. Powered by 🤗 transformers and open-source datasets. ★67 Updated 4 months ago
- ★117 Updated 4 months ago
- ★24 Updated last year
- ★48 Updated 3 months ago
- Meta-CoT: Generalizable Chain-of-Thought Prompting in Mixed-task Scenarios with Large Language Models ★94 Updated last year
- Codebase accompanying the Summary of a Haystack paper. ★74 Updated 5 months ago
- [SIGIR 2024 (Demo)] CoSearchAgent: A Lightweight Collaborative Search Agent with Large Language Models ★22 Updated last year
- ReBase: Training Task Experts through Retrieval Based Distillation ★28 Updated 2 weeks ago
- ★32 Updated 7 months ago
- ★66 Updated last year
- Open Implementations of LLM Analyses ★98 Updated 4 months ago
- EMNLP 2024 "Re-reading improves reasoning in large language models". Simply repeating the question to get bidirectional understanding for… ★24 Updated 2 months ago
- Official repo for NAACL 2024 Findings paper "LeTI: Learning to Generate from Textual Interactions." ★63 Updated last year
- A repository for transformer critique learning and generation ★88 Updated last year
- Code for the paper "LASER: LLM Agent with State-Space Exploration for Web Navigation" ★32 Updated last year
- ★47 Updated 7 months ago
- Zeus LLM Trainer is a rewrite of Stanford Alpaca aiming to be the trainer for all Large Language Models ★69 Updated last year
- Pre-training code for CrystalCoder 7B LLM ★55 Updated 9 months ago
- Based on the tree of thoughts paper ★46 Updated last year
- The corresponding code from our paper "REFINER: Reasoning Feedback on Intermediate Representations" (EACL 2024). Do not hesitate t… ★70 Updated last year
- Code for EMNLP 2024 paper "Learn Beyond The Answer: Training Language Models with Reflection for Mathematical Reasoning" ★52 Updated 4 months ago
- A set of utilities for running few-shot prompting experiments on large-language models ★118 Updated last year
- ★37 Updated last year
- Script for processing OpenAI's PRM800K process supervision dataset into an Alpaca-style instruction-response format ★27 Updated last year
- Code and Data for "Evaluating Correctness and Faithfulness of Instruction-Following Models for Question Answering" ★83 Updated 6 months ago
- A library for squeakily cleaning and filtering language datasets. ★46 Updated last year
- ★27 Updated this week