maxbbraun/llama4micro
A "large" language model running on a microcontroller
☆507 · Updated last year

Alternatives and similar repositories for llama4micro:
Users interested in llama4micro are comparing it to the repositories listed below.
- LLaVA server (llama.cpp) ☆176 · Updated last year
- Run PaliGemma in real time ☆129 · Updated 8 months ago
- Mistral7B playing DOOM ☆127 · Updated 6 months ago
- Throwaway GPT inference ☆140 · Updated 7 months ago
- llama3.np, a pure NumPy implementation of the Llama 3 model ☆972 · Updated 7 months ago
- 1.58-bit LLM on Apple Silicon using MLX ☆180 · Updated 8 months ago
- TinyChatEngine: On-Device LLM Inference Library ☆797 · Updated 6 months ago
- A modern model graph visualizer and debugger ☆1,109 · Updated this week
- Efficient inference of Transformer models ☆414 · Updated 5 months ago
- The repository for the code of the UltraFastBERT paper ☆514 · Updated 10 months ago
- A small code base for training large models ☆283 · Updated last month
- The Tensor (or Array) ☆420 · Updated 5 months ago
- An implementation of bucketMul LLM inference ☆215 · Updated 6 months ago
- Absolute minimalistic implementation of a GPT-like transformer using only NumPy (<650 lines) ☆250 · Updated last year
- A really tiny autograd engine ☆89 · Updated 9 months ago
- (WIP) A small but powerful, homemade PyTorch from scratch ☆516 · Updated this week
- ☆180 · Updated 5 months ago
- Lightweight inference library for ONNX files, written in C++. It can run Stable Diffusion XL 1.0 on a RPi Zero 2 (or in 298 MB of RAM) but… ☆1,899 · Updated last week
- Open-weights language model from Google DeepMind, based on Griffin ☆614 · Updated 6 months ago
- GGUF implementation in C as a library and a CLI tool ☆251 · Updated 3 weeks ago
- ☆238 · Updated 10 months ago
- Algebraic enhancements for GEMM & AI accelerators ☆263 · Updated this week
- Port of MiniGPT4 in C++ (4-bit, 5-bit, 6-bit, 8-bit, 16-bit CPU inference with GGML) ☆562 · Updated last year
- If tinygrad wasn't small enough for you... ☆677 · Updated 10 months ago
- ☆1,268 · Updated last year
- llama.cpp with the BakLLaVA model describing what it sees ☆380 · Updated last year
- A minimal Tensor Processing Unit (TPU) inspired by Google's TPUv1 ☆126 · Updated 5 months ago
- C++ implementation of BLOOM ☆810 · Updated last year
- CLIP inference in plain C/C++ with no extra dependencies ☆475 · Updated 5 months ago
- Fine-tune Mistral-7B on 3090s, A100s, H100s ☆706 · Updated last year