sunildkumar / lora_from_scratch
Implements Low-Rank Adaptation (LoRA) finetuning from scratch
☆81 · Updated 2 years ago
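As a rough illustration of the technique this repo implements (a minimal numpy sketch, not the repo's actual code): LoRA freezes the pretrained weight W and learns only a low-rank update B·A, so the adapted layer computes x·(W + BA)ᵀ while training far fewer parameters. The names and shapes below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, rank = 8, 16, 2  # rank << min(d_out, d_in)

W = rng.normal(size=(d_out, d_in))        # frozen pretrained weight
A = rng.normal(size=(rank, d_in)) * 0.01  # trainable, small random init
B = np.zeros((d_out, rank))               # trainable, zero init: BA = 0 at start

def lora_forward(x):
    # y = x W^T + x (BA)^T; during finetuning only A and B receive gradients
    return x @ W.T + x @ (B @ A).T

x = rng.normal(size=(4, d_in))
# With B initialized to zero, the adapted layer matches the base model exactly.
assert np.allclose(lora_forward(x), x @ W.T)
```

Zero-initializing B is the standard trick: the adapter starts as a no-op, so finetuning begins from the pretrained model's behavior.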
Alternatives and similar repositories for lora_from_scratch
Users interested in lora_from_scratch are comparing it to the libraries listed below.
- Collection of autoregressive model implementations ☆86 · Updated 6 months ago
- Implementation of the Llama architecture with RLHF + Q-learning ☆167 · Updated 9 months ago
- LoRA and DoRA from Scratch Implementations ☆211 · Updated last year
- Minimal GRPO implementation from scratch ☆98 · Updated 7 months ago
- A comprehensive deep dive into the world of tokens ☆226 · Updated last year
- ☆88 · Updated last year
- Code for training & evaluating Contextual Document Embedding models ☆199 · Updated 5 months ago
- σ-GPT: A New Approach to Autoregressive Models ☆68 · Updated last year
- An introduction to LLM Sampling ☆79 · Updated 10 months ago
- The simplest, fastest repository for training/finetuning medium-sized GPTs ☆170 · Updated 4 months ago
- Code used for the "Optimizing Memory Usage for Training LLMs and Vision Transformers in PyTorch" blog po… ☆91 · Updated 2 years ago
- Implementation of CALM from the paper "LLM Augmented LLMs: Expanding Capabilities through Composition", out of Google DeepMind ☆177 · Updated last year
- Project 2 (Building Large Language Models) for Stanford CS324: Understanding and Developing Large Language Models (Winter 2022) ☆105 · Updated 2 years ago
- Code from a practical deep dive into using Mamba for information extraction ☆56 · Updated last year
- An extension of the nanoGPT repository for training small MoE models ☆205 · Updated 7 months ago
- A set of scripts and notebooks on LLM finetuning and dataset creation ☆110 · Updated last year
- Implementation of the conditionally routed attention in the CoLT5 architecture, in PyTorch ☆230 · Updated last year
- Toolkit for attaching, training, saving and loading new heads for transformer models ☆289 · Updated 7 months ago
- ☆50 · Updated last year
- Prune transformer layers ☆69 · Updated last year
- ☆91 · Updated last year
- Implementation of DoRA ☆304 · Updated last year
- $100K or 100 Days: Trade-offs when Pre-Training with Academic Resources ☆147 · Updated last month
- ☆81 · Updated last year
- Experiments around a simple idea for inducing multiple hierarchical predictive models within a GPT ☆222 · Updated last year
- Set of scripts to finetune LLMs ☆38 · Updated last year
- ☆94 · Updated 2 years ago
- A collection of various LLM sampling methods implemented in pure PyTorch ☆22 · Updated 10 months ago
- Some personal experiments around routing tokens to different autoregressive attention modules, akin to mixture-of-experts ☆119 · Updated last year
- Exploring finetuning public checkpoints on filtered 8K sequences from the Pile ☆115 · Updated 2 years ago