sunildkumar / lora_from_scratch
Implements Low-Rank Adaptation (LoRA) fine-tuning from scratch
☆77 · Updated 2 years ago
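The core idea the repository implements can be sketched in a few lines: a frozen weight matrix W is adapted as W + (alpha / r) · BA, where only the low-rank factors A and B are trained. The sketch below is illustrative and framework-free; the shapes, names, and scaling convention are assumptions following the original LoRA paper, not code from this repo.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, r, alpha = 16, 8, 4, 8.0  # hypothetical sizes and rank

W = rng.normal(size=(d_out, d_in))     # frozen pretrained weight
A = rng.normal(size=(r, d_in)) * 0.01  # low-rank factor A (trainable)
B = np.zeros((d_out, r))               # low-rank factor B, zero-initialized
                                       # so the adapter starts as a no-op

def lora_forward(x):
    # frozen base path plus the low-rank update, scaled by alpha / r
    return x @ W.T + (alpha / r) * (x @ A.T @ B.T)

x = rng.normal(size=(2, d_in))
# with B = 0, the adapted output equals the frozen base output
assert np.allclose(lora_forward(x), x @ W.T)
```

During fine-tuning, gradients flow only into A and B (2 · r · d parameters per layer instead of d²), which is what makes LoRA cheap; at inference the update can be merged back into W.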
Alternatives and similar repositories for lora_from_scratch
Users interested in lora_from_scratch are comparing it to the libraries listed below.
- Collection of autoregressive model implementations ☆85 · Updated 2 months ago
- LoRA and DoRA from Scratch Implementations ☆206 · Updated last year
- Implementation of the Llama architecture with RLHF + Q-learning ☆165 · Updated 5 months ago
- ☆92 · Updated last year
- Website for hosting the Open Foundation Models Cheat Sheet ☆267 · Updated 2 months ago
- Explorations into the proposal from the paper "Grokfast: Accelerated Grokking by Amplifying Slow Gradients" ☆101 · Updated 6 months ago
- Minimal GRPO implementation from scratch ☆92 · Updated 4 months ago
- The simplest, fastest repository for training/finetuning medium-sized GPTs ☆147 · Updated 2 weeks ago
- A puzzle to learn about prompting ☆131 · Updated 2 years ago
- ☆87 · Updated last year
- Understand and test language model architectures on synthetic tasks ☆219 · Updated last month
- Fully fine-tune large models like Mistral, Llama-2-13B, or Qwen-14B completely for free ☆232 · Updated 8 months ago
- A comprehensive deep dive into the world of tokens ☆224 · Updated last year
- Prune transformer layers ☆69 · Updated last year
- NeurIPS Large Language Model Efficiency Challenge: 1 LLM + 1 GPU + 1 Day ☆256 · Updated last year
- ☆81 · Updated last year
- An extension of the nanoGPT repository for training small MoE models ☆160 · Updated 4 months ago
- A MAD laboratory to improve AI architecture designs 🧪 ☆123 · Updated 6 months ago
- An introduction to LLM sampling ☆79 · Updated 7 months ago
- QLoRA with enhanced multi-GPU support ☆37 · Updated last year
- Highly commented implementations of Transformers in PyTorch ☆136 · Updated last year
- Implementation of DoRA ☆296 · Updated last year
- An open-source reproduction of NVIDIA's nGPT (Normalized Transformer with Representation Learning on the Hypersphere) ☆101 · Updated 4 months ago
- This repository contains the code used for my "Optimizing Memory Usage for Training LLMs and Vision Transformers in PyTorch" blog po… ☆92 · Updated 2 years ago
- Code for training and evaluating Contextual Document Embedding models ☆194 · Updated 2 months ago
- ☆61 · Updated last year
- ☆49 · Updated last year
- Code to reproduce "Transformers Can Do Arithmetic with the Right Embeddings", McLeish et al. (NeurIPS 2024) ☆190 · Updated last year
- Supercharge Hugging Face transformers with model parallelism ☆77 · Updated 9 months ago
- Toolkit for attaching, training, saving, and loading new heads for transformer models ☆282 · Updated 4 months ago