angry-kratos / Simple_Llama3_from_scratchLinks

☆30

Alternatives and similar repositories for Simple_Llama3_from_scratch

Users that are interested in Simple_Llama3_from_scratch are comparing it to the libraries listed below

Sorting:

hkproj / multi-latent-attention
☆45Updated 4 months ago
kabir2505 / tiny-mixtral
☆45Updated 5 months ago
joey00072 / ohara
Collection of autoregressive model implementation
☆86Updated 5 months ago
kmohan321 / Research_Papers
☆46Updated 6 months ago
ariG23498 / quantized-diffusion-inference
Notebook and Scripts that showcase running quantized diffusion models on consumer GPUs
☆38Updated 11 months ago
fangyuan-ksgk / Mini-LLaVA
A minimal implementation of LLaVA-style VLM with interleaved image & text & video processing ability.
☆96Updated 9 months ago
ariG23498 / fine-tune-paligemma
Notebooks for fine tuning pali gemma
☆117Updated 5 months ago
cosmo3769 / Quantized-LLMs
Quantization of LLMs and benchmarking.
☆10Updated last year
naklecha / llm-inference-optimizations-explained
in this repository, i'm going to implement increasingly complex llm inference optimizations
☆68Updated 4 months ago
sunildkumar / lora_from_scratch
Implements Low-Rank Adaptation(LoRA) Finetuning from scratch
☆80Updated 2 years ago
CG80499 / KAN-GPT-2
Training small GPT-2 style models using Kolmogorov-Arnold networks.
☆120Updated last year
mingyin0312 / RL4LLM
RL significantly the reasoning capability of Qwen2.5-1.5B-Instruct
☆30Updated 7 months ago
AviSoori1x / seemore
From scratch implementation of a vision language model in pure PyTorch
☆243Updated last year
tanaymeh / mamba-train
A single repo with all scripts and utils to train / fine-tune the Mamba model with or without FIM
☆59Updated last year
ThinamXx / Meta-llama
Complete implementation of Llama2 with/without KV cache & inference 🚀
☆48Updated last year
melisa-writer / short-transformers
Prune transformer layers
☆69Updated last year
rasbt / pytorch-memory-optim
This code repository contains the code used for my "Optimizing Memory Usage for Training LLMs and Vision Transformers in PyTorch" blog po…
☆92Updated 2 years ago
johnma2006 / candle
Deep learning library implemented from scratch in numpy. Mixtral, Mamba, LLaMA, GPT, ResNet, and other experiments.
☆52Updated last year
tyler-romero / microR1
Simple repository for training small reasoning models
☆40Updated 8 months ago
joey00072 / nanoGRPO
nanoGRPO is a lightweight implementation of Group Relative Policy Optimization (GRPO)
☆121Updated 5 months ago
ariG23498 / gemma3-object-detection
Fine tune Gemma 3 on an object detection task
☆85Updated 2 months ago
QuixiAI / grokadamw
☆136Updated last year
RiddleHe / llm-interp
A collection of lightweight interpretability scripts to understand how LLMs think
☆56Updated 2 weeks ago
rasbt / dora-from-scratch
LoRA and DoRA from Scratch Implementations
☆211Updated last year
rasbt / RAGs
RAGs: Simple implementations of Retrieval Augmented Generation (RAG) Systems
☆133Updated 8 months ago
fangyuan-ksgk / Tiny-GRPO
minimal GRPO implementation from scratch
☆98Updated 6 months ago
hkproj / pytorch-transformer-distributed
Distributed training (multi-node) of a Transformer model
☆84Updated last year
VatsaDev / NanoPoor
NanoGPT-speedrunning for the poor T4 enjoyers
☆72Updated 5 months ago
geronimi73 / phi2-finetune
☆88Updated last year
HarleyCoops / smolThinker-.5B
A Qwen .5B reasoning model trained on OpenR1-Math-220k
☆14Updated 7 months ago