pierrel55 / llama_stLinks

Load and run Llama from safetensors files in C

☆12

Alternatives and similar repositories for llama_st

Users that are interested in llama_st are comparing it to the libraries listed below

Sorting:

rafacelente / bllama
1.58-bit LLaMa model
☆83Updated last year
reka-ai / rekaquant
☆62Updated 4 months ago
grctest / Electron-BitNet
Running Microsoft's BitNet via Electron, React & Astro
☆48Updated 2 months ago
jadechip / nanoXLSTM
The simplest, fastest repository for training/finetuning medium-sized xLSTMs.
☆41Updated last year
introlix / Swiftlet
SwiftLet is a lightweight Python framework for running open-source Large Language Models (LLMs) locally using safetensors
☆28Updated 3 months ago
adriancable / qwen3.c
Local Qwen3 LLM inference. One easy-to-understand file of C source with no dependencies.
☆146Updated 4 months ago
EduardTalianu / EntropixLab
entropix style sampling + GUI
☆27Updated last year
ideaweaver-ai / Tiny-Children-Stories-30M-model
☆120Updated 5 months ago
abhisheknair10 / llama3.cu
Lightweight Llama 3 8B Inference Engine in CUDA C
☆53Updated 8 months ago
gigit0000 / qwen3.c
Lightweight C inference for Qwen3 GGUF. Multiturn prefix caching & batch processing.
☆18Updated 2 months ago
lazy-guy / chess-llama
Tiny Llama model trained to play chess
☆27Updated 4 months ago
nivibilla / build-nanogpt
Video+code lecture on building nanoGPT from scratch
☆68Updated last year
nyunAI / PruneGPT
☆51Updated last year
FishiaTee / Tumera
Yet another frontend for LLM, written using .NET and WinUI 3
☆10Updated 2 months ago
ideaweaver-ai / DeepSeek-Children-Stories-15M-model
☆107Updated 5 months ago
serp-ai / Parameter-Efficient-MoE
Parameter-Efficient Sparsity Crafting From Dense to Mixture-of-Experts for Instruction Tuning on General Tasks
☆31Updated last year
IST-DASLab / gptq-gguf-toolkit
Efficient non-uniform quantization with GPTQ for GGUF
☆53Updated 2 months ago
jukofyork / transplant-vocab
Transplants vocabulary between language models, enabling the creation of draft models for speculative decoding WITHOUT retraining.
☆46Updated last month
KevlarKanou / rwkv7.c
Inference RWKV v7 in pure C.
☆42Updated last month
hscspring / llama.np
Inference Llama/Llama2/Llama3 Modes in NumPy
☆21Updated 2 years ago
HabermannR / Fantasy-Tribe-Game
LLM backed Fantasy Tribe Game
☆19Updated last year
Antoine-Villiere / JacQues
JacQues is a Dash-based interactive web application that facilitates real-time chat and document management.
☆22Updated last year
Cornell-RelaxML / yaqa-quantization
☆63Updated 5 months ago
agokrani / distillKitPlus
Easy to use, High Performant Knowledge Distillation for LLMs
☆96Updated 6 months ago
fairydreaming / farel-bench
Testing LLM reasoning abilities with family relationship quizzes.
☆63Updated 10 months ago
pranavjad / tinyllama-bitnet
Train your own small bitnet model
☆75Updated last year
foundation-model-stack / fms-model-optimizer
FMS Model Optimizer is a framework for developing reduced precision neural network models.
☆20Updated last week
hasaranga / NativeChat
win32 native frontend for llama-cli
☆12Updated last year
VatsaDev / NanoPhi-alpha
GPT-2 small trained on phi-like data
☆67Updated last year
FishiaTee / yawullm
Yet Another (LLM) Web UI, made with Gemini
☆12Updated 11 months ago