angry-kratos / Simple_Llama3_from_scratch
☆30Updated last year
Alternatives and similar repositories for Simple_Llama3_from_scratch
Users who are interested in Simple_Llama3_from_scratch are comparing it to the libraries listed below:
- Collection of autoregressive model implementations☆86Updated 4 months ago
- Deep learning library implemented from scratch in numpy. Mixtral, Mamba, LLaMA, GPT, ResNet, and other experiments.☆51Updated last year
- A single repo with all scripts and utils to train / fine-tune the Mamba model with or without FIM☆56Updated last year
- Quantization of LLMs and benchmarking.☆10Updated last year
- ☆43Updated 3 months ago
- Documented and Unit Tested educational Deep Learning framework with Autograd from scratch.☆120Updated last year
- Training small GPT-2 style models using Kolmogorov-Arnold networks.☆121Updated last year
- ☆46Updated 4 months ago
- Set of scripts to finetune LLMs☆37Updated last year
- First-principle implementations of groundbreaking AI algorithms using a wide range of deep learning frameworks, accompanied by supporting…☆178Updated last month
- ☆134Updated last year
- A compact LLM pretrained in 9 days using high-quality data☆322Updated 4 months ago
- From scratch implementation of a vision language model in pure PyTorch☆235Updated last year
- Swarming algorithms like PSO, Ant Colony, Sakana, and more in PyTorch 😊☆129Updated 2 weeks ago
- RL significantly improves the reasoning capability of Qwen2.5-1.5B-Instruct☆30Updated 6 months ago
- Prune transformer layers☆69Updated last year
- ☆88Updated last year
- Explorations into the proposal from the paper "Grokfast, Accelerated Grokking by Amplifying Slow Gradients"☆101Updated 8 months ago
- Notebook and Scripts that showcase running quantized diffusion models on consumer GPUs☆38Updated 10 months ago
- ☆44Updated 3 months ago
- Complete implementation of Llama2 with/without KV cache & inference 🚀☆48Updated last year
- Several types of attention modules written in PyTorch for learning purposes☆53Updated 10 months ago
- Flexible Python library providing building blocks (layers) for reproducible Transformers research (Tensorflow ✅, Pytorch 🔜, and Jax 🔜)☆53Updated last year
- Code for NeurIPS LLM Efficiency Challenge☆59Updated last year
- Testing and evaluating the capabilities of Vision-Language models (PaliGemma) in performing computer vision tasks such as object detectio…☆83Updated last year
- Fully fine-tune large models like Mistral, Llama-2-13B, or Qwen-14B completely for free☆232Updated 9 months ago
- Repo hosting code and materials related to speeding up LLM inference using token merging.☆36Updated last month
- A byte-level decoder architecture that matches the performance of tokenized Transformers.☆65Updated last year
- A minimal implementation of LLaVA-style VLM with interleaved image & text & video processing ability.☆96Updated 8 months ago
- Fine tune Gemma 3 on an object detection task☆78Updated last month