Jaykef / micrograd.c
Port of Karpathy's micrograd in pure C. Micrograd is a tiny scalar-valued autograd engine and a neural net library on top of it with a PyTorch-like API.
Related projects:
- Inference of Mamba models in pure C
- A pipeline for LLM knowledge distillation
- The simplest, fastest repository for training/finetuning medium-sized xLSTMs
- Video+code lecture on building nanoGPT from scratch
- GPU benchmark
- 1.58-bit LLM on Apple Silicon using MLX
- An example implementation of RLHF (or, more accurately, RLAIF) built on MLX and HuggingFace
- Fast parallel LLM inference for MLX
- Tiny ASIC implementation of the matrix multiplication unit from "The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits"
- LLM training in simple, raw C/CUDA
- Train your own small BitNet model
- Inference Llama 2 in C++
- Port of Andrej Karpathy's nanoGPT to the Apple MLX framework
- An implementation of Self-Extend, to expand the context window via grouped attention
- 1.58-bit LLaMA model
- Inference code for mixtral-8x7b-32kseqlen
- Testing LLM reasoning abilities with family-relationship quizzes
- Implementation of Mamba in Rust
- A simplified version of Google's Gemma model to be used for learning
- A single repo with all scripts and utils to train/fine-tune the Mamba model with or without FIM
- Low-rank adapter extraction for fine-tuned transformer models
- Eh, simple and works.
- Token Omission Via Attention
- Fast approximate inference on a single GPU with sparsity-aware offloading
- Self-hosted LLM chatbot arena, with yourself as the only judge