ckkissane / rlhf-shakespeare

Shakespeare transformer fine-tuned to generate positive sentiment samples using RLHF

☆9

Alternatives and similar repositories for rlhf-shakespeare

Users who are interested in rlhf-shakespeare are comparing it to the repositories listed below

r-three / RAD
Reference implementation for Reward-Augmented Decoding: Efficient Controlled Text Generation With a Unidirectional Reward Model
☆43 Updated last year
pacman100 / peft-codegen-25
☆23 Updated 2 years ago
xrsrke / pipegoose
Large scale 4D parallelism pre-training for 🤗 transformers in Mixture of Experts *(still work in progress)*
☆85 Updated last year
luyug / magix
Supercharge huggingface transformers with model parallelism.
☆77 Updated 9 months ago
akjindal53244 / Arithmo
Small and Efficient Mathematical Reasoning LLMs
☆71 Updated last year
austrian-code-wizard / c3po
☆27 Updated last week
LAION-AI / math_problems-step-by-step_solutions
Here we provide and collect many functions to generate math problem and step by step solutions for LLM training
☆17 Updated 2 years ago
facebookresearch / lss_eval
This is a new metric that can be used to evaluate faithfulness of text generated by LLMs. The work behind this repository can be found he…
☆31 Updated last year
ContextualAI / CLAIR_and_APO
Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment
☆60 Updated 10 months ago
cloneofsimo / fim-llama-deepspeed
☆31 Updated last year
huggingface / peft-pytorch-conference
Code for the examples presented in the talk "Training a Llama in your backyard: fine-tuning very large models on consumer hardware" given…
☆14 Updated last year
scottlogic-alex / prm800k-denorm
Script for processing OpenAI's PRM800K process supervision dataset into an Alpaca-style instruction-response format
☆27 Updated 2 years ago
minyoungg / LTE
☆68 Updated last year
CarperAI / autocrit
A repository for transformer critique learning and generation
☆90 Updated last year
likenneth / q_probe
Q-Probe: A Lightweight Approach to Reward Maximization for Language Models
☆41 Updated last year
huu4ontocord / MDEL
Multi-Domain Expert Learning
☆67 Updated last year
deep-diver / LLM-Pref-Mark-UI
☆37 Updated 2 years ago
Upaya07 / NeurIPS-llm-efficiency-challenge
Code for NeurIPS LLM Efficiency Challenge
☆59 Updated last year
cmu-l3 / neurips2024-inference-tutorial-code
NeurIPS 2024 tutorial on LLM Inference
☆45 Updated 7 months ago
lvwerra / rl-implementations
This repo contains a set of notebooks to reproduce reinforcement learning algorithms.
☆15 Updated 2 years ago
arcee-ai / DAM
☆52 Updated 8 months ago
EleutherAI / mdl
Minimum Description Length probing for neural network representations
☆18 Updated 5 months ago
haileyschoelkopf / triton-index
See https://github.com/cuda-mode/triton-index/ instead!
☆11 Updated last year
CarperAI / squeakily
A library for squeakily cleaning and filtering language datasets.
☆47 Updated 2 years ago
hamishivi / EasyLM
Large language models (LLMs) made easy, EasyLM is a one stop solution for pre-training, finetuning, evaluating and serving LLMs in JAX/Fl…
☆75 Updated 10 months ago
mlfoundations / scaling
Language models scale reliably with over-training and on downstream tasks
☆97 Updated last year
CarperAI / treasure_trove
☆22 Updated last year
kernelmachine / silo-lm
SILO Language Models code repository
☆81 Updated last year
ruiqi-zhong / D5
The GitHub repo for Goal Driven Discovery of Distributional Differences via Language Descriptions
☆70 Updated 2 years ago
seonghyeonye / Flipped-Learning
[ICLR 2023] Guess the Instruction! Flipped Learning Makes Language Models Stronger Zero-Shot Learners
☆116 Updated 2 weeks ago