maxbbraun / llama4micro
A "large" language model running on a microcontroller
☆531 · Updated last year
Alternatives and similar repositories for llama4micro
Users who are interested in llama4micro are comparing it to the libraries listed below.
- Fine-tune mistral-7B on 3090s, a100s, h100s ☆714 · Updated last year
- a small code base for training large models ☆301 · Updated 2 months ago
- LLM-powered lossless compression tool ☆281 · Updated 10 months ago
- Visualize the intermediate output of Mistral 7B ☆364 · Updated 5 months ago
- Mistral7B playing DOOM ☆132 · Updated 11 months ago
- Port of MiniGPT4 in C++ (4bit, 5bit, 6bit, 8bit, 16bit CPU inference with GGML) ☆567 · Updated last year
- Llama 2 Everywhere (L2E) ☆1,519 · Updated 5 months ago
- run paligemma in real time ☆131 · Updated last year
- TinyChatEngine: On-Device LLM Inference Library ☆865 · Updated 11 months ago
- An mlx project to train a base model on your whatsapp chats using (Q)Lora finetuning ☆168 · Updated last year
- The repository for the code of the UltraFastBERT paper ☆516 · Updated last year
- Minimalistic, extremely fast, and hackable researcher's toolbench for GPT models in 307 lines of code. Reaches <3.8 validation loss on wi… ☆345 · Updated 10 months ago
- Instructions on how to run LLMs on Raspberry PI ☆208 · Updated 10 months ago
- Finetune llama2-70b and codellama on MacBook Air without quantization ☆447 · Updated last year
- LLaVA server (llama.cpp). ☆180 · Updated last year
- ☆864 · Updated last year
- WebGPU LLM inference tuned by hand ☆151 · Updated 2 years ago
- Large Language Models (LLMs) applications and tools running on Apple Silicon in real-time with Apple MLX. ☆446 · Updated 4 months ago
- SoTA Transformers with C-backend for fast inference on your CPU. ☆309 · Updated last year
- llama.cpp with the BakLLaVA model describes what it sees ☆383 · Updated last year
- Alex Krizhevsky's original code from Google Code ☆193 · Updated 9 years ago
- Open weights language model from Google DeepMind, based on Griffin. ☆641 · Updated 3 weeks ago
- Run GGML models with Kubernetes. ☆173 · Updated last year
- Ungreedy subword tokenizer and vocabulary trainer for Python, Go & Javascript ☆584 · Updated 11 months ago
- LLM-based code completion engine ☆193 · Updated 5 months ago
- C++ implementation for BLOOM ☆810 · Updated 2 years ago
- An implementation of bucketMul LLM inference ☆218 · Updated 11 months ago
- ☆415 · Updated last year
- MiniLLM is a minimal system for running modern LLMs on consumer-grade GPUs ☆913 · Updated 2 years ago
- A tool to analyze and debug neural networks in pytorch. Use a GUI to traverse the computation graph and view the data from many different… ☆287 · Updated 6 months ago