angry-kratos / Simple_Llama3_from_scratch
☆30Updated last year
Alternatives and similar repositories for Simple_Llama3_from_scratch
Users that are interested in Simple_Llama3_from_scratch are comparing it to the libraries listed below
 - Collection of autoregressive model implementations☆86Updated 6 months ago
 - ☆45Updated 5 months ago
 - ☆45Updated 5 months ago
 - RL significantly improves the reasoning capability of Qwen2.5-1.5B-Instruct☆30Updated 8 months ago
 - Training small GPT-2 style models using Kolmogorov-Arnold networks.☆121Updated last year
 - ☆46Updated 7 months ago
 - The simplest, fastest repository for training/finetuning medium-sized xLSTMs.☆41Updated last year
 - Deep learning library implemented from scratch in numpy. Mixtral, Mamba, LLaMA, GPT, ResNet, and other experiments.☆53Updated last year
 - Complete implementation of Llama2 with/without KV cache & inference 🚀☆48Updated last year
 - Notebook and Scripts that showcase running quantized diffusion models on consumer GPUs☆38Updated last year
 - My fork of Allen AI's OLMo for educational purposes.☆30Updated 10 months ago
 - Documented and Unit Tested educational Deep Learning framework with Autograd from scratch.☆122Updated last year
 - A single repo with all scripts and utils to train / fine-tune the Mamba model with or without FIM☆59Updated last year
 - KernelBench v2: Can LLMs Write GPU Kernels? - Benchmark with Torch -> Triton (and more!) problems☆21Updated 3 months ago
 - A collection of lightweight interpretability scripts to understand how LLMs think☆61Updated this week
 - This code repository contains the code used for my "Optimizing Memory Usage for Training LLMs and Vision Transformers in PyTorch" blog po…☆91Updated 2 years ago
 - Repository containing awesome resources regarding Hugging Face tooling.☆48Updated last year
 - Notebooks for fine-tuning PaliGemma☆117Updated 6 months ago
 - Quantization of LLMs and benchmarking.☆10Updated last year
 - LoRA: Low-Rank Adaptation of Large Language Models implemented using PyTorch☆117Updated 2 years ago
 - Building a 2.3M-parameter LLM from scratch with LLaMA 1 architecture.☆189Updated last year
 - In this repository, I'm going to implement increasingly complex LLM inference optimizations☆70Updated 5 months ago
 - Implements Low-Rank Adaptation (LoRA) fine-tuning from scratch☆81Updated 2 years ago
 - The Automated LLM Speedrunning Benchmark measures how well LLM agents can reproduce previous innovations and discover new ones in languag…☆109Updated 3 weeks ago
 - Prune transformer layers☆69Updated last year
 - ☆48Updated last year
 - Testing and evaluating the capabilities of Vision-Language models (PaliGemma) in performing computer vision tasks such as object detectio…☆84Updated last year
 - Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients.☆202Updated last year
 - ☆88Updated last year
 - Simple repository for training small reasoning models☆44Updated 8 months ago