Implement llama3 inference step by step: grasp the core concepts, master the process derivation, and implement the code.
☆626 · Feb 24, 2025 · Updated last year
Alternatives and similar repositories for Deepdive-llama3-from-scratch
Users that are interested in Deepdive-llama3-from-scratch are comparing it to the libraries listed below
- ☆1,465 · Feb 15, 2025 · Updated last year
- Animating R1's thoughts. ☆383 · Feb 17, 2025 · Updated last year
- Docker-based inference engine for AMD GPUs ☆233 · Oct 7, 2024 · Updated last year
- A browser-based, WebGL2 implementation of GPT-2 with transformer block and attention matrix visualization ☆342 · Oct 24, 2025 · Updated 4 months ago
- A reimplementation of Stable Diffusion 3.5 in pure PyTorch ☆695 · Jun 14, 2025 · Updated 8 months ago
- Run and explore Llama models locally with minimal dependencies on CPU ☆188 · Oct 12, 2024 · Updated last year
- A complete end-to-end pipeline for LLM interpretability with sparse autoencoders (SAEs) using Llama 3.2, written in pure PyTorch and full… ☆630 · Mar 23, 2025 · Updated 11 months ago
- A TypeScript library to create platform-agnostic applications ☆71 · Feb 21, 2026 · Updated last week
- Run larger LLMs with longer contexts on Apple Silicon by using differentiated precision for KV cache quantization. KVSplit enables 8-bit … ☆362 · May 21, 2025 · Updated 9 months ago
- Migrate from Docker to Podman. ☆383 · Apr 2, 2025 · Updated 11 months ago
- RAG Logger is an open-source logging tool designed specifically for Retrieval-Augmented Generation (RAG) applications. It serves as a lig… ☆227 · Dec 24, 2024 · Updated last year
- Deep Reinforcement Learning: Zero to Hero! ☆2,262 · Oct 27, 2025 · Updated 4 months ago
- Examples and guides for using the VLM Run API ☆306 · Jan 27, 2026 · Updated last month
- Things you can do with the token embeddings of an LLM ☆1,453 · Dec 1, 2025 · Updated 3 months ago
- ☆200 · May 5, 2025 · Updated 10 months ago
- Understanding R1-Zero-Like Training: A Critical Perspective ☆1,219 · Aug 27, 2025 · Updated 6 months ago
- Neurox control helm chart details ☆30 · Apr 29, 2025 · Updated 10 months ago
- LLM Analytics ☆707 · Oct 19, 2024 · Updated last year
- See Through Your Models ☆400 · Jul 8, 2025 · Updated 7 months ago
- Multi-modal OCR pipeline optimized for ML training (text, figure, math, tables, diagrams) ☆682 · May 20, 2025 · Updated 9 months ago
- A course on LLM inference serving on Apple Silicon for systems engineers: build a tiny vLLM + Qwen. ☆3,894 · Updated this week
- Fully neural approach for text chunking ☆406 · Oct 23, 2025 · Updated 4 months ago
- Minimal LLM inference in Rust ☆1,032 · Oct 24, 2024 · Updated last year
- High-Performance Implementation of OpenAI's TikToken. ☆473 · Jul 3, 2025 · Updated 8 months ago
- Personal Site ☆20 · Jan 11, 2026 · Updated last month
- ☆10 · Feb 14, 2025 · Updated last year
- Live-bending a foundation model’s output at neural network level. ☆273 · Apr 7, 2025 · Updated 10 months ago
- llama3 implementation one matrix multiplication at a time ☆15,242 · May 23, 2024 · Updated last year
- Have a natural, spoken conversation with AI! ☆3,552 · Jul 11, 2025 · Updated 7 months ago
- Create mind maps to learn new things using AI. ☆568 · Nov 2, 2024 · Updated last year
- Single-file, pure CUDA C implementation for running inference on Qwen3 0.6B GGUF. No Dependencies. ☆23 · Nov 26, 2025 · Updated 3 months ago
- Your toolkit for autonomous, evolving agent ecosystems. Create, execute, govern, and evolve agents that learn from experience, collaborat… ☆448 · Nov 24, 2025 · Updated 3 months ago
- OpenCV+YOLO+LLAVA powered video surveillance system ☆785 · Oct 21, 2025 · Updated 4 months ago
- Omni SenseVoice: High-Speed Speech Recognition with word timestamps 🗣️🎯 ☆885 · Dec 10, 2025 · Updated 2 months ago
- A powerful document AI question-answering tool that connects to your local Ollama models. Create, manage, and interact with RAG systems f… ☆1,096 · Aug 9, 2025 · Updated 6 months ago
- webgl image editing library with filters and effects ☆32 · Dec 13, 2025 · Updated 2 months ago
- Dead Simple LLM Abliteration ☆251 · Feb 18, 2025 · Updated last year
- Fully open-source command-line AI assistant inspired by OpenAI Codex, supporting local language models. ☆666 · Jul 7, 2025 · Updated 8 months ago
- From-scratch implementation of a sparse mixture of experts language model inspired by Andrej Karpathy's makemore :) ☆790 · Oct 30, 2024 · Updated last year