sunildkumar / lora_from_scratch
Implements Low-Rank Adaptation (LoRA) finetuning from scratch
☆74 Updated last year
Alternatives and similar repositories for lora_from_scratch:
Users who are interested in lora_from_scratch are comparing it to the repositories listed below:
- Collection of autoregressive model implementations ☆85 Updated 2 months ago
- ☆49 Updated last year
- ☆78 Updated 9 months ago
- This code repository contains the code used for my "Optimizing Memory Usage for Training LLMs and Vision Transformers in PyTorch" blog po… ☆91 Updated last year
- ☆92 Updated last year
- Exploring finetuning public checkpoints on filtered 8K sequences on the Pile ☆115 Updated 2 years ago
- LoRA and DoRA from Scratch Implementations ☆202 Updated last year
- An introduction to LLM Sampling ☆77 Updated 4 months ago
- Implementation of the Llama architecture with RLHF + Q-learning ☆164 Updated 2 months ago
- Fast bare-bones BPE for modern tokenizer training ☆154 Updated 3 weeks ago
- Highly commented implementations of Transformers in PyTorch ☆136 Updated last year
- Set of scripts to finetune LLMs ☆37 Updated last year
- Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients. ☆198 Updated 9 months ago
- ☆79 Updated last year
- σ-GPT: A New Approach to Autoregressive Models ☆62 Updated 8 months ago
- Understand and test language model architectures on synthetic tasks. ☆192 Updated last month
- Cerule - A Tiny Mighty Vision Model ☆67 Updated 7 months ago
- Explorations into the proposal from the paper "Grokfast, Accelerated Grokking by Amplifying Slow Gradients" ☆98 Updated 4 months ago
- Multipack distributed sampler for fast padding-free training of LLMs ☆188 Updated 8 months ago
- Micro Llama is a small Llama-based model with 300M parameters, trained from scratch on a $500 budget ☆146 Updated last year
- A fast implementation of T5/UL2 in PyTorch using Flash Attention ☆103 Updated last month
- ☆53 Updated last year
- Prune transformer layers ☆68 Updated 10 months ago
- Small and Efficient Mathematical Reasoning LLMs ☆71 Updated last year
- QLoRA with Enhanced Multi GPU Support ☆37 Updated last year
- A comprehensive deep dive into the world of tokens ☆222 Updated 10 months ago
- Erasing concepts from neural representations with provable guarantees ☆227 Updated 3 months ago
- Minimal GRPO implementation from scratch ☆85 Updated last month
- Testing LLM reasoning abilities with family relationship quizzes. ☆62 Updated 2 months ago
- ☆67 Updated 8 months ago