maxbbraun / llama4micro
A "large" language model running on a microcontroller
☆525 · Updated last year
Alternatives and similar repositories for llama4micro:
Users interested in llama4micro are comparing it to the repositories listed below.
- Llama 2 Everywhere (L2E) ☆1,517 · Updated 3 months ago
- Minimalistic, extremely fast, and hackable researcher's toolbench for GPT models in 307 lines of code. Reaches <3.8 validation loss on wi… ☆345 · Updated 8 months ago
- C++ implementation for BLOOM ☆810 · Updated last year
- GGUF implementation in C as a library and a CLI tool ☆268 · Updated 3 months ago
- run paligemma in real time ☆131 · Updated 11 months ago
- Mistral7B playing DOOM ☆131 · Updated 9 months ago
- An implementation of bucketMul LLM inference ☆216 · Updated 9 months ago
- An MLX project to fine-tune a base model on your WhatsApp chats using (Q)LoRA ☆166 · Updated last year
- Inference Llama 2 in one file of pure Python ☆415 · Updated 6 months ago
- a small code base for training large models ☆291 · Updated 4 months ago
- Tensor library for machine learning ☆278 · Updated 2 years ago
- Running an LLM on the ESP32 ☆292 · Updated 7 months ago
- Following master Karpathy with GPT-2 implementation and training, writing lots of comments because I have the memory of a goldfish ☆174 · Updated 8 months ago
- Port of MiniGPT4 in C++ (4-bit, 5-bit, 6-bit, 8-bit, 16-bit CPU inference with GGML) ☆567 · Updated last year
- A really tiny autograd engine ☆92 · Updated last year
- ☆97 · Updated last year
- llama.cpp with the BakLLaVA model describing what it sees ☆383 · Updated last year
- llama3.np is a pure NumPy implementation of the Llama 3 model. ☆981 · Updated 10 months ago
- Open weights language model from Google DeepMind, based on Griffin. ☆635 · Updated 2 months ago
- throwaway GPT inference ☆138 · Updated 10 months ago
- ☆143 · Updated 2 years ago
- Simple byte pair encoding (BPE) mechanism for tokenization, written purely in C ☆129 · Updated 5 months ago
- Inference of Mamba models in pure C ☆187 · Updated last year
- Let's make sand talk ☆590 · Updated last year
- ☆412 · Updated last year
- gpt-2 from scratch in mlx ☆382 · Updated 10 months ago
- The repository for the code of the UltraFastBERT paper ☆518 · Updated last year
- Lightweight inference library for ONNX files, written in C++. It can run Stable Diffusion XL 1.0 on an RPi Zero 2 (or in 298MB of RAM) but… ☆1,939 · Updated this week
- fast vector database made in numpy ☆751 · Updated 11 months ago
- LLaVA server (llama.cpp). ☆179 · Updated last year
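One of the repositories above implements byte pair encoding in pure C for tokenization. For reference, here is a minimal Python sketch of the greedy BPE merge loop (function names and the toy input are illustrative, not taken from that repository):

```python
from collections import Counter

def most_frequent_pair(tokens):
    # Count adjacent token pairs and return the most common one (None if no pairs).
    pairs = Counter(zip(tokens, tokens[1:]))
    return pairs.most_common(1)[0][0] if pairs else None

def merge_pair(tokens, pair, new_token):
    # Replace every non-overlapping occurrence of `pair` with `new_token`.
    out, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == pair:
            out.append(new_token)
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    return out

def bpe_train(text, num_merges):
    # Start from individual characters and greedily merge the most
    # frequent adjacent pair, up to num_merges times.
    tokens = list(text)
    merges = []
    for _ in range(num_merges):
        pair = most_frequent_pair(tokens)
        if pair is None:
            break
        merges.append(pair)
        tokens = merge_pair(tokens, pair, pair[0] + pair[1])
    return tokens, merges

tokens, merges = bpe_train("aaabdaaabac", 3)
print(tokens)
```

Joining the resulting tokens always reconstructs the input text; a real tokenizer would additionally record the learned merges so the same sequence can be replayed on new text.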