ethanyanjiali / minChatGPT
A minimal example of aligning language models with RLHF, similar to ChatGPT
☆213 · Updated last year
Related projects
Alternatives and complementary repositories for minChatGPT
- ☆94 · Updated last year
- A (somewhat) minimal library for finetuning language models with PPO on human feedback. ☆86 · Updated last year
- RLHF implementation details of OAI's 2019 codebase ☆152 · Updated 9 months ago
- Open Instruction Generalist is an assistant trained on massive synthetic instructions to perform many millions of tasks ☆206 · Updated 10 months ago
- Recurrent Memory Transformer ☆150 · Updated last year
- Implementation of Recurrent Memory Transformer, NeurIPS 2022 paper, in PyTorch ☆393 · Updated 9 months ago
- Due to the restrictions on LLaMA, we try to reimplement BLOOM-LoRA (the much less restrictive BLOOM license is here: https://huggingface.co/spaces/bigs…) ☆183 · Updated last year
- Multipack distributed sampler for fast padding-free training of LLMs ☆176 · Updated 3 months ago
- Rectified Rotary Position Embeddings ☆339 · Updated 5 months ago
- Chain-of-Hindsight, a scalable RLHF method ☆218 · Updated last year
- A large-scale, fine-grained, diverse preference dataset (and models). ☆309 · Updated 10 months ago
- Official repository of "NEFTune: Noisy Embeddings Improve Instruction Finetuning" ☆381 · Updated 5 months ago
- This is the repo for the paper "Shepherd: A Critic for Language Model Generation" ☆211 · Updated last year
- A fine-tuned LLaMA that is good at arithmetic tasks ☆174 · Updated last year
- All available datasets for instruction tuning of large language models ☆236 · Updated 11 months ago
- ☆158 · Updated last year
- DSIR, a large-scale data selection framework for language model training ☆227 · Updated 7 months ago
- Implementation of the conditionally routed attention in the CoLT5 architecture, in PyTorch ☆224 · Updated 2 months ago
- Code for fine-tuning Platypus family LLMs using LoRA ☆623 · Updated 9 months ago
- Implementation of the paper "Data Engineering for Scaling Language Models to 128K Context" ☆435 · Updated 7 months ago
- Code used for sourcing and cleaning the BigScience ROOTS corpus ☆305 · Updated last year
- Official code for ReLoRA from the paper "Stack More Layers Differently: High-Rank Training Through Low-Rank Updates" ☆433 · Updated 6 months ago
- Implementation of Reinforcement Learning from Human Feedback (RLHF) ☆169 · Updated last year
- A crude RLHF layer on top of nanoGPT with the Gumbel-Softmax trick ☆287 · Updated 11 months ago
- [NeurIPS 2023] RRHF & Wombat ☆798 · Updated last year
- Code and data for "Scaling Relationship on Learning Mathematical Reasoning with Large Language Models" ☆216 · Updated 2 months ago
- BABILong, a benchmark for LLM evaluation using the needle-in-a-haystack approach ☆150 · Updated 2 months ago
- This repository contains code to quantitatively evaluate instruction-tuned models such as Alpaca and Flan-T5 on held-out tasks ☆528 · Updated 8 months ago
- Code accompanying the paper "Pretraining Language Models with Human Preferences" ☆176 · Updated 8 months ago
- PyTorch implementation of DoReMi, a method for optimizing the data mixture weights in language modeling datasets ☆304 · Updated 10 months ago