~950-line, minimal, extensible LLM inference engine built from scratch.
☆470 Jan 9, 2026 Updated 4 months ago
Alternatives and similar repositories for simple-llm
Users who are interested in simple-llm are comparing it to the libraries listed below.
- in this repository, i'm going to implement increasingly complex llm inference optimizations ☆85 May 22, 2025 Updated 11 months ago
- Continual Learning Bench ☆113 May 5, 2026 Updated 2 weeks ago
- A DSPy Adapter for exact-fidelity prompt templates with full control over messages. ☆47 Feb 23, 2026 Updated 2 months ago
- ☆15 Nov 18, 2025 Updated 6 months ago
- Tiny evaluation of leading LLMs on competitive programming problems ☆14 Apr 10, 2026 Updated last month
- Scripts for training Qwen 2.5 VL with ms-swift and GRPO ☆12 Feb 27, 2025 Updated last year
- 📝 The official repository of "Rethinking Cross-Generator Image Forgery Detection through DINOv3" ☆23 Dec 2, 2025 Updated 5 months ago
- KV Cache & LoRA for minGPT ☆63 Mar 4, 2026 Updated 2 months ago
- ROSA-Tuning ☆73 Feb 4, 2026 Updated 3 months ago
- CodeVibes is an intelligent AI-powered code analysis tool that scans your GitHub repositories to uncover security vulnerabilities, bugs a… ☆61 Jan 12, 2026 Updated 4 months ago
- Compression for unit-norm embedding vectors using spherical coordinates ☆81 Jan 23, 2026 Updated 3 months ago
- GEMV implementation with CUTLASS ☆21 Aug 21, 2025 Updated 8 months ago
- ☆86 Apr 7, 2026 Updated last month
- Prompt + regex lab ☆10 Nov 22, 2023 Updated 2 years ago
- ☆33 Jul 15, 2025 Updated 10 months ago
- Storing long contexts in tiny caches with self-study ☆265 Mar 23, 2026 Updated last month
- Example repo showcasing model training and deployment with distil claude cli skill ☆55 Jan 19, 2026 Updated 4 months ago
- ☆39 Feb 18, 2025 Updated last year
- Load and run Llama from safetensors files in C ☆15 Oct 24, 2024 Updated last year
- ☆21 Mar 3, 2025 Updated last year
- Official Project Page for Deep Delta Learning (https://huggingface.co/papers/2601.00417) ☆354 May 13, 2026 Updated last week
- ☆11 Jan 9, 2019 Updated 7 years ago
- ExpertFingerprinting: Behavioral Pattern Analysis and Specialization Mapping of Experts in GPT-OSS-20B's Mixture-of-Experts Architecture ☆28 Feb 3, 2026 Updated 3 months ago
- General-purpose planning and execution harness for LLMs: structured phases, critique, gating, and review ☆66 Updated this week
- MCP server for the X (Twitter) API -- give AI agents the ability to post, search, read, and engage on X ☆44 Mar 24, 2026 Updated last month
- Performance of the C++ interface of flash attention and flash attention v2 in large language model (LLM) inference scenarios. ☆16 Aug 31, 2023 Updated 2 years ago
- Adding Marimo to Datasette ☆21 Mar 24, 2025 Updated last year
- NanoGPT (124M) in 90 seconds ☆5,233 May 12, 2026 Updated last week
- ☆145 Mar 31, 2026 Updated last month
- CIFAR-10 speedruns: 94% in 2.6 seconds and 96% in 27 seconds ☆379 Nov 15, 2025 Updated 6 months ago
- Structured Generation Evals ☆14 Sep 25, 2024 Updated last year
- ☆13 Nov 30, 2022 Updated 3 years ago
- Repository for the implementation and evaluation of DD-GloVe, a train-time debiasing algorithm to learn GloVe word embeddings by leveragi… ☆13 May 29, 2022 Updated 3 years ago
- A fully functional Convolutional VAE implemented in pure C from scratch. ☆23 Jan 19, 2026 Updated 4 months ago
- A terminal-based AI assistant for Linux sysadmins. ☆38 Mar 20, 2026 Updated 2 months ago
- [CVPR 2026] Official repo for "EVATok: Adaptive Length Video Tokenization for Efficient Visual Autoregressive Generation" ☆58 Mar 13, 2026 Updated 2 months ago
- AI-Driven Research Systems (ADRS) ☆142 Dec 17, 2025 Updated 5 months ago
- ☆73 May 5, 2026 Updated 2 weeks ago
- JAX Implementations of Descript Audio Codec and EnCodec ☆36 Mar 30, 2025 Updated last year