sanjeevanahilan / nanoChatGPT
A crude RLHF layer on top of nanoGPT with Gumbel-Softmax trick
☆289 · Updated last year
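For context, the Gumbel-Softmax trick named in the description lets a model draw (approximately) discrete samples from a categorical distribution while keeping the sampling step differentiable: Gumbel(0, 1) noise is added to the logits, and a temperature-scaled softmax relaxes the argmax. A minimal NumPy sketch follows; the function name and shapes are illustrative and not taken from the nanoChatGPT code.

```python
import numpy as np

def gumbel_softmax(logits, tau=1.0, rng=None):
    """Relaxed sample from a categorical distribution over the logits.

    Adds Gumbel(0, 1) noise to the logits and applies a softmax scaled by
    the temperature tau; as tau -> 0 the output approaches a one-hot sample.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Gumbel(0, 1) noise via the inverse-CDF: g = -log(-log(u)), u ~ U(0, 1).
    u = rng.uniform(low=1e-9, high=1.0, size=np.shape(logits))
    gumbel = -np.log(-np.log(u))
    z = (np.asarray(logits, dtype=float) + gumbel) / tau
    z = z - z.max()  # shift for numerical stability before exponentiating
    exp_z = np.exp(z)
    return exp_z / exp_z.sum()

# A low temperature yields a near-one-hot vector over the three classes.
probs = gumbel_softmax([2.0, 0.5, -1.0], tau=0.1, rng=np.random.default_rng(0))
```

In an RLHF setting this relaxation is what allows gradients from a reward signal to flow back through sampled tokens, which an ordinary argmax or multinomial draw would block.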
Alternatives and similar repositories for nanoChatGPT:
Users interested in nanoChatGPT are comparing it to the repositories listed below:
- Used for adaptive human-in-the-loop evaluation of language and embedding models. ☆309 · Updated 2 years ago
- ☆458 · Updated last year
- Exploring finetuning public checkpoints on filtered 8K sequences on the Pile ☆115 · Updated 2 years ago
- Multipack distributed sampler for fast padding-free training of LLMs ☆188 · Updated 8 months ago
- ☆412 · Updated last year
- Implementation of Recurrent Memory Transformer, NeurIPS 2022 paper, in PyTorch ☆407 · Updated 4 months ago
- A bagel, with everything. ☆320 · Updated last year
- Minimalistic, extremely fast, and hackable researcher's toolbench for GPT models in 307 lines of code. Reaches <3.8 validation loss on wi… ☆345 · Updated 9 months ago
- Implementation of Memorizing Transformers (ICLR 2022), attention net augmented with indexing and retrieval of memories using approximate … ☆632 · Updated last year
- A minimal example of aligning language models with RLHF, similar to ChatGPT ☆217 · Updated last year
- Landmark Attention: Random-Access Infinite Context Length for Transformers ☆423 · Updated last year
- Batched LoRAs ☆341 · Updated last year
- ☆159 · Updated 2 years ago
- A puzzle to learn about prompting ☆127 · Updated last year
- ☆94 · Updated last year
- A repository for research on medium-sized language models. ☆495 · Updated last week
- ☆92 · Updated last year
- Language Modeling with the H3 State Space Model ☆520 · Updated last year
- OpenAlpaca: A Fully Open-Source Instruction-Following Model Based on OpenLLaMA ☆302 · Updated last year
- This repository contains code for extending the Stanford Alpaca synthetic instruction tuning to existing instruction-tuned models such as… ☆351 · Updated last year
- Experiments around a simple idea for inducing multiple hierarchical predictive models within a GPT ☆211 · Updated 8 months ago
- The GeoV model is a large language model designed by Georges Harik and uses Rotary Positional Embeddings with Relative distances (RoPER).… ☆122 · Updated 2 years ago
- Simple next-token prediction for RLHF ☆225 · Updated last year
- TART: A plug-and-play Transformer module for task-agnostic reasoning ☆196 · Updated last year
- JAX implementation of the Llama 2 model ☆218 · Updated last year
- Recurrent Memory Transformer ☆149 · Updated last year
- Inference code for Mistral and Mixtral hacked up into the original Llama implementation ☆371 · Updated last year
- Official code from the paper "Offline RL for Natural Language Generation with Implicit Language Q Learning" ☆206 · Updated last year
- Code for fine-tuning Platypus family LLMs using LoRA ☆629 · Updated last year
- Code repository for the c-BTM paper ☆106 · Updated last year