sunildkumar / lora_from_scratch
Implements Low-Rank Adaptation (LoRA) fine-tuning from scratch
☆71 · Updated last year
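For context on what a from-scratch LoRA implementation involves: the core idea is to freeze a pretrained weight matrix W and learn a low-rank update B·A, so the effective weight becomes W + (α/r)·BA with far fewer trainable parameters. The sketch below is a hypothetical minimal PyTorch illustration of that idea, not code taken from this repository; the class name `LoRALinear` and hyperparameters `r` and `alpha` are illustrative assumptions.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Minimal LoRA sketch (hypothetical, not the repo's actual code):
    a frozen linear layer plus a trainable low-rank update B @ A."""

    def __init__(self, in_features, out_features, r=8, alpha=16):
        super().__init__()
        self.base = nn.Linear(in_features, out_features)
        self.base.weight.requires_grad_(False)  # freeze pretrained weight
        self.base.bias.requires_grad_(False)
        # Low-rank factors: A is initialized with small random values,
        # B with zeros, so the adapter starts as a no-op.
        self.lora_A = nn.Parameter(torch.randn(r, in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(out_features, r))
        self.scaling = alpha / r

    def forward(self, x):
        # Frozen path plus scaled low-rank correction.
        return self.base(x) + self.scaling * (x @ self.lora_A.T @ self.lora_B.T)

layer = LoRALinear(32, 64)
x = torch.randn(4, 32)
out = layer(x)
# With B zero-initialized, the output equals the frozen base layer's output.
assert torch.allclose(out, layer.base(x))
```

Only `lora_A` and `lora_B` receive gradients, which is what makes LoRA fine-tuning memory-efficient relative to full fine-tuning.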
Alternatives and similar repositories for lora_from_scratch:
Users interested in lora_from_scratch are comparing it to the repositories listed below.
- Collection of autoregressive model implementations ☆81 · Updated last week
- Highly commented implementations of Transformers in PyTorch ☆132 · Updated last year
- Code used for the "Optimizing Memory Usage for Training LLMs and Vision Transformers in PyTorch" blog po… ☆87 · Updated last year
- LoRA and DoRA from Scratch Implementations ☆196 · Updated 11 months ago
- ☆75 · Updated 7 months ago
- Toolkit for attaching, training, saving, and loading of new heads for transformer models ☆262 · Updated 2 weeks ago
- ☆92 · Updated last year
- Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients ☆191 · Updated 7 months ago
- ☆113 · Updated 4 months ago
- Implementation of the Llama architecture with RLHF + Q-learning ☆162 · Updated 2 weeks ago
- ☆78 · Updated 10 months ago
- Project 2 (Building Large Language Models) for Stanford CS324: Understanding and Developing Large Language Models (Winter 2022) ☆102 · Updated last year
- σ-GPT: A New Approach to Autoregressive Models ☆61 · Updated 6 months ago
- Fast bare-bones BPE for modern tokenizer training ☆146 · Updated 4 months ago
- A comprehensive deep dive into the world of tokens ☆220 · Updated 7 months ago
- Set of scripts to fine-tune LLMs ☆36 · Updated 10 months ago
- Minimal example scripts for the Hugging Face Trainer, focused on staying under 150 lines ☆197 · Updated 9 months ago
- Miscellaneous utility functions / decorators / modules related to PyTorch and Accelerate to help speed up implementation of new… ☆120 · Updated 6 months ago
- Implementation of DoRA ☆290 · Updated 8 months ago
- An introduction to LLM Sampling ☆75 · Updated 2 months ago
- A byte-level decoder architecture that matches the performance of tokenized Transformers ☆65 · Updated 9 months ago
- Comprehensive analysis of the performance differences between QLoRA, LoRA, and full fine-tunes ☆82 · Updated last year
- Testing LLM reasoning abilities with family-relationship quizzes ☆57 · Updated 3 weeks ago
- GPTQLoRA: Efficient Finetuning of Quantized LLMs with GPTQ ☆99 · Updated last year
- Code to reproduce "Transformers Can Do Arithmetic with the Right Embeddings", McLeish et al. (NeurIPS 2024) ☆185 · Updated 8 months ago
- Understand and test language model architectures on synthetic tasks ☆181 · Updated last month
- Some common Hugging Face transformers in maximal update parametrization (µP) ☆78 · Updated 2 years ago
- Explorations into the proposal from the paper "Grokfast: Accelerated Grokking by Amplifying Slow Gradients" ☆95 · Updated last month
- Data preparation code for the Amber 7B LLM ☆85 · Updated 9 months ago
- ☆53 · Updated last year