angry-kratos / Simple_Llama3_from_scratch
☆31 · Updated 11 months ago
Alternatives and similar repositories for Simple_Llama3_from_scratch
Users interested in Simple_Llama3_from_scratch are comparing it to the repositories listed below.
- ☆36 · Updated 2 weeks ago
- ☆39 · Updated last month
- ☆130 · Updated 9 months ago
- Working implementation of DeepSeek MLA ☆41 · Updated 4 months ago
- A single repo with all scripts and utils to train / fine-tune the Mamba model with or without FIM ☆54 · Updated last year
- Hub for researchers exploring VLMs and Multimodal Learning :) ☆37 · Updated this week
- Collection of autoregressive model implementations ☆85 · Updated last month
- Complete implementation of Llama2 with/without KV cache & inference 🚀 ☆46 · Updated last year
- The code from our practical deep dive using Mamba for information extraction ☆53 · Updated last year
- The simplest, fastest repository for training/finetuning medium-sized xLSTMs ☆41 · Updated last year
- My fork of Allen AI's OLMo for educational purposes ☆30 · Updated 6 months ago
- Quantization of LLMs and benchmarking ☆10 · Updated last year
- ☆46 · Updated 2 months ago
- In this repository, I'm implementing increasingly complex LLM inference optimizations ☆58 · Updated 2 weeks ago
- Notebooks and scripts that showcase running quantized diffusion models on consumer GPUs ☆38 · Updated 7 months ago
- Fine-tune Gemma 3 on an object detection task ☆46 · Updated this week
- A competition to get you started on the NeurIPS AI Hackercup ☆28 · Updated 8 months ago
- This repository contains a better implementation of Kolmogorov-Arnold networks ☆61 · Updated this week
- Making the official Triton tutorials actually comprehensible ☆36 · Updated 2 months ago
- Documented and unit-tested educational deep learning framework with autograd from scratch ☆115 · Updated last year
- ☆47 · Updated 9 months ago
- Explorations into the proposal from the paper "Grokfast: Accelerated Grokking by Amplifying Slow Gradients" ☆100 · Updated 5 months ago
- We study toy models of skill learning ☆28 · Updated 4 months ago
- Set of scripts to finetune LLMs ☆37 · Updated last year
- RL from zero pretrain: can it be done? We'll see. ☆24 · Updated last week
- A Qwen 0.5B reasoning model trained on OpenR1-Math-220k ☆14 · Updated 3 months ago
- A minimal implementation of a LLaVA-style VLM with interleaved image, text, and video processing ability ☆93 · Updated 5 months ago
- GPU Kernels ☆179 · Updated last month
- Training small GPT-2 style models using Kolmogorov-Arnold networks ☆117 · Updated last year
- Micro Llama is a small Llama-based model with 300M parameters trained from scratch on a $500 budget ☆151 · Updated last year