danielgrittner / nanoGPT-LoRA
The simplest, fastest repository for training/finetuning medium-sized GPTs with LoRA support.
☆29 · Updated last year
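The core idea behind the repo's LoRA support is to freeze a pretrained weight matrix W and learn a low-rank update B·A in its place. Below is a minimal NumPy sketch of that forward pass; the function name, shapes, and `alpha`/`r` scaling convention are illustrative assumptions, not the repo's actual API.

```python
import numpy as np

def lora_linear(x, W, A, B, alpha=16, r=4):
    """Linear layer with a LoRA update: y = x @ W.T + (alpha / r) * x @ A.T @ B.T.

    W (out x in) is frozen; only A (r x in) and B (out x r) are trained,
    so the trainable parameter count scales with r, not with out * in.
    """
    return x @ W.T + (alpha / r) * (x @ A.T) @ B.T

rng = np.random.default_rng(0)
d_in, d_out, r = 8, 8, 4
W = rng.normal(size=(d_out, d_in))          # frozen pretrained weight
A = rng.normal(size=(r, d_in)) * 0.01       # small random init
B = np.zeros((d_out, r))                    # B starts at zero, so the LoRA
                                            # path is a no-op at init
x = rng.normal(size=(2, d_in))
# With B zero, the output matches the frozen layer exactly:
assert np.allclose(lora_linear(x, W, A, B, r=r), x @ W.T)
```

Starting B at zero is the standard trick: finetuning begins from the pretrained model's exact behavior, and the update grows only as B is trained.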
Alternatives and similar repositories for nanoGPT-LoRA
Users interested in nanoGPT-LoRA are comparing it to the libraries listed below.
- Official code for ReLoRA from the paper "Stack More Layers Differently: High-Rank Training Through Low-Rank Updates" ☆469 · Updated last year
- This is the implementation of the paper "AdaMix: Mixture-of-Adaptations for Parameter-efficient Model Tuning" (https://arxiv.org/abs/2205.1…) ☆136 · Updated 2 years ago
- An Extensible Continual Learning Framework Focused on Language Models (LMs) ☆291 · Updated last year
- ☆84 · Updated 2 years ago
- Reverse Instructions to generate instruction-tuning data with corpus examples ☆216 · Updated last year
- Mass-editing thousands of facts into a transformer memory (ICLR 2023) ☆532 · Updated last year
- Inference-Time Intervention: Eliciting Truthful Answers from a Language Model ☆561 · Updated 10 months ago
- Code for the paper "The Impact of Positional Encoding on Length Generalization in Transformers" (NeurIPS 2023) ☆138 · Updated last year
- Locating and editing factual associations in GPT (NeurIPS 2022) ☆707 · Updated last year
- Tk-Instruct is a Transformer model that is tuned to solve many NLP tasks by following instructions. ☆182 · Updated 3 years ago
- [EMNLP 2023] Adapting Language Models to Compress Long Contexts ☆321 · Updated last year
- Datasets for Instruction Tuning of Large Language Models ☆259 · Updated 2 years ago
- How to think step-by-step: A mechanistic understanding of chain-of-thought reasoning ☆25 · Updated last year
- This repository contains code to quantitatively evaluate instruction-tuned models such as Alpaca and Flan-T5 on held-out tasks. ☆551 · Updated last year
- ☆249 · Updated 2 years ago
- Wrapper to easily generate the chat template for Llama2 ☆65 · Updated last year
- Explorations into some recent techniques surrounding speculative decoding ☆295 · Updated 11 months ago
- ☆180 · Updated 2 years ago
- Simple implementation of Speculative Sampling in NumPy for GPT-2. ☆98 · Updated 2 years ago
- A curated list of Human Preference Datasets for LLM fine-tuning, RLHF, and eval. ☆384 · Updated 2 years ago
- Sparse probing paper full code. ☆65 · Updated 2 years ago
- Simple next-token-prediction for RLHF ☆227 · Updated 2 years ago
- Code accompanying the paper "Pretraining Language Models with Human Preferences" ☆180 · Updated last year
- Repo accompanying our paper "Do Llamas Work in English? On the Latent Language of Multilingual Transformers" ☆80 · Updated last year
- Codebase for ICML submission "DOGE: Domain Reweighting with Generalization Estimation" ☆21 · Updated last year
- Instruct-tune Open LLaMA / RedPajama / StableLM models on consumer hardware using QLoRA ☆81 · Updated 2 years ago
- Performant framework for training, analyzing and visualizing Sparse Autoencoders (SAEs) and their frontier variants ☆168 · Updated this week
- Code and data for "Lost in the Middle: How Language Models Use Long Contexts" ☆368 · Updated last year
- ☆272 · Updated 2 years ago
- ☆98 · Updated 2 years ago