maxbbraun / llama4micro
A "large" language model running on a microcontroller
☆528Updated last year
Alternatives and similar repositories for llama4micro
Users who are interested in llama4micro are comparing it to the libraries listed below
- Port of MiniGPT4 in C++ (4bit, 5bit, 6bit, 8bit, 16bit CPU inference with GGML)☆567Updated last year
- llama.cpp with the BakLLaVA model describes what it sees☆383Updated last year
- ☆1,025Updated last year
- CLIP inference in plain C/C++ with no extra dependencies☆499Updated 9 months ago
- SoTA Transformers with C-backend for fast inference on your CPU.☆309Updated last year
- gpt-2 from scratch in mlx☆387Updated 11 months ago
- Inference Vision Transformer (ViT) in plain C/C++ with ggml☆287Updated last year
- A reinforcement learning framework based on MLX.☆233Updated 3 months ago
- run paligemma in real time☆131Updated last year
- Simple byte pair encoding mechanism used for tokenization, written purely in C☆132Updated 6 months ago
- LLM papers I'm reading, mostly on inference and model compression☆730Updated last year
- Instructions on how to run LLMs on Raspberry PI☆207Updated 10 months ago
- a small code base for training large models☆300Updated last month
- Minimalistic, extremely fast, and hackable researcher's toolbench for GPT models in 307 lines of code. Reaches <3.8 validation loss on wi…☆345Updated 10 months ago
- Llama 2 Everywhere (L2E)☆1,517Updated 4 months ago
- A really tiny autograd engine☆94Updated last week
- An implementation of bucketMul LLM inference☆217Updated 11 months ago
- Fine-tune mistral-7B on 3090s, a100s, h100s☆713Updated last year
- ☆243Updated last year
- LLaVA server (llama.cpp).☆179Updated last year
- LLM-based code completion engine☆188Updated 4 months ago
- Efficient Inference of Transformer models☆433Updated 10 months ago
- A modern model graph visualizer and debugger☆1,212Updated last week
- TinyChatEngine: On-Device LLM Inference Library☆854Updated 11 months ago
- WebGPU LLM inference tuned by hand☆150Updated last year
- ☆98Updated last year
- Mistral7B playing DOOM☆132Updated 10 months ago
- throwaway GPT inference☆139Updated last year
- Inference of Mamba models in pure C☆187Updated last year
- FlashAttention (Metal Port)☆492Updated 8 months ago