sunildkumar / lora_from_scratch
Implements Low-Rank Adaptation (LoRA) finetuning from scratch
☆78Updated 2 years ago
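The repository's topic, LoRA, can be summarized in a few lines: a frozen pretrained weight `W` is augmented with a trainable low-rank update `(alpha / r) * B @ A`. The sketch below is illustrative only (not taken from the repo); the class name, defaults, and zero-initialization of `B` follow common LoRA practice but are assumptions here.

```python
import numpy as np

class LoRALinear:
    """Minimal LoRA sketch: y = x @ W.T + scale * x @ A.T @ B.T,
    where only A (r x in) and B (out x r) would be trained."""

    def __init__(self, W, r=4, alpha=8, seed=0):
        rng = np.random.default_rng(seed)
        self.W = W                                             # frozen pretrained weight (out, in)
        self.A = rng.standard_normal((r, W.shape[1])) * 0.01   # trainable down-projection
        self.B = np.zeros((W.shape[0], r))                     # trainable up-projection, zero-init
        self.scale = alpha / r

    def __call__(self, x):
        # Base path plus low-rank path. Because B starts at zero,
        # the adapted layer initially matches the pretrained one.
        return x @ self.W.T + self.scale * (x @ self.A.T) @ self.B.T

    def merge(self):
        # Fold the adapter into W for zero-overhead inference.
        return self.W + self.scale * self.B @ self.A
```

After training, `merge()` returns a single dense weight, so inference pays no extra cost over the original layer.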
Alternatives and similar repositories for lora_from_scratch
Users interested in lora_from_scratch are comparing it to the libraries listed below
- Implementation of the Llama architecture with RLHF + Q-learning☆166Updated 6 months ago
- A comprehensive deep dive into the world of tokens☆226Updated last year
- Collection of autoregressive model implementations☆86Updated 4 months ago
- LoRA and DoRA from Scratch Implementations☆209Updated last year
- Contains the code used for my "Optimizing Memory Usage for Training LLMs and Vision Transformers in PyTorch" blog po…☆92Updated 2 years ago
- Implementation of CALM from the paper "LLM Augmented LLMs: Expanding Capabilities through Composition", out of Google DeepMind☆177Updated 11 months ago
- Code repository for Black Mamba☆254Updated last year
- Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients.☆199Updated last year
- Micro Llama is a small Llama-based model with 300M parameters, trained from scratch on a $500 budget☆158Updated 2 weeks ago
- Exploring finetuning public checkpoints on filtered 8K sequences from the Pile☆116Updated 2 years ago
- An introduction to LLM Sampling☆79Updated 8 months ago
- Experiments around a simple idea for inducing multiple hierarchical predictive models within a GPT☆220Updated last year
- ☆82Updated last year
- Code from our practical dive into using Mamba for information extraction☆54Updated last year
- Token Omission Via Attention☆128Updated 10 months ago
- ☆40Updated last year
- Implementation of DoRA☆301Updated last year
- ☆88Updated last year
- code for training & evaluating Contextual Document Embedding models☆197Updated 3 months ago
- ☆69Updated last year
- Explorations into the proposal from the paper "Grokfast, Accelerated Grokking by Amplifying Slow Gradients"☆101Updated 8 months ago
- A set of scripts and notebooks on LLM finetuning and dataset creation☆110Updated 11 months ago
- Multipack distributed sampler for fast padding-free training of LLMs☆199Updated last year
- Project 2 (Building Large Language Models) for Stanford CS324: Understanding and Developing Large Language Models (Winter 2022)☆105Updated 2 years ago
- This repository's goal is to precompile all past presentations of the Huggingface reading group☆48Updated 11 months ago
- Large scale 4D parallelism pre-training for 🤗 transformers in Mixture of Experts *(still work in progress)*☆87Updated last year
- The simplest, fastest repository for training/finetuning medium-sized GPTs.☆153Updated 2 months ago
- Minimal example scripts of the Hugging Face Trainer, focused on staying under 150 lines☆197Updated last year
- Some personal experiments around routing tokens to different autoregressive attention, akin to mixture-of-experts☆120Updated 10 months ago
- $100K or 100 Days: Trade-offs when Pre-Training with Academic Resources☆143Updated 3 months ago