FareedKhan-dev / create-million-parameter-llm-from-scratch
Building a 2.3M-parameter LLM from scratch with LLaMA 1 architecture.
☆170Updated last year
Alternatives and similar repositories for create-million-parameter-llm-from-scratch
Users that are interested in create-million-parameter-llm-from-scratch are comparing it to the libraries listed below
- LLaMA 3 is one of the most promising open-source models after Mistral; we will recreate its architecture in a simpler manner.☆166Updated 9 months ago
- Implementation of a GPT-4o-like multimodal model from scratch using Python☆60Updated 2 months ago
- From scratch implementation of a vision language model in pure PyTorch☆220Updated last year
- A Straightforward, Step-by-Step Implementation of a Video Diffusion Model☆45Updated 2 weeks ago
- Building LLaMA 4 MoE from Scratch☆52Updated last month
- ☆39Updated last month
- ☆31Updated 11 months ago
- Starter pack for NeurIPS LLM Efficiency Challenge 2023.☆122Updated last year
- First-principle implementations of groundbreaking AI algorithms using a wide range of deep learning frameworks, accompanied by supporting…☆167Updated this week
- Various installation guides for Large Language Models☆69Updated last month
- Distributed training (multi-node) of a Transformer model☆68Updated last year
- A simplified version of Meta's Llama 3 model to be used for learning☆41Updated last year
- Notes about "Attention is all you need" video (https://www.youtube.com/watch?v=bCz4OMemCcA)☆281Updated 2 years ago
- GPU Kernels☆179Updated last month
- An LLM cookbook for building your own from scratch, all the way from gathering data to training a model☆144Updated 11 months ago
- ☆89Updated 2 months ago
- An overview of GRPO & DeepSeek-R1 Training with Open Source GRPO Model Fine Tuning☆32Updated 2 weeks ago
- LoRA and DoRA from Scratch Implementations☆204Updated last year
- ☆32Updated 6 months ago
- Notes and commented code for RLHF (PPO)☆96Updated last year
- A set of scripts and notebooks on LLM fine-tuning and dataset creation☆111Updated 8 months ago
- Maximizing the Performance of a Simple RAG using RL☆61Updated 2 months ago
- RAGs: Simple implementations of Retrieval Augmented Generation (RAG) Systems☆106Updated 4 months ago
- Reference implementation of Mistral AI 7B v0.1 model.☆29Updated last year
- Notes about LLaMA 2 model☆61Updated last year
- A collection of LogitsProcessors to customize and enhance LLM behavior for specific tasks.☆288Updated this week
- LoRA: Low-Rank Adaptation of Large Language Models implemented using PyTorch☆105Updated last year
- Learn Generative AI with PyTorch (Manning Publications, 2024)☆100Updated 6 months ago
- LLM (Large Language Model) FineTuning☆540Updated 2 months ago
- ☆83Updated last year