psmarter/mini-infer

LLM inference engine from scratch — paged KV cache, continuous batching, chunked prefill, prefix caching, speculative decoding, CUDA graph, tensor parallelism, OpenAI-compatible serving

☆264

Alternatives and similar repositories for mini-infer

Users that are interested in mini-infer are comparing it to the libraries listed below.

Augenstern-O / Stickers
A collection of stickers and memes from social platforms
☆102 · Jun 15, 2026 · Updated last week
olcc78 / start-to-study
☆108 · Mar 23, 2026 · Updated 3 months ago
CLRV-FYX / fyx_sort
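An ultra-fast sorting algorithm written by a first-year senior high school student with AI assistance; it offers adaptive behavior among other features and has reached industrial-grade standards
☆72 · Jan 24, 2026 · Updated 5 months ago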
zhizibianjie-omniedge / geo-cultural-encoding
geo-cultural-encoding
☆74 · Jan 6, 2026 · Updated 5 months ago
Wholiver / swiftui-design-skill
SwiftUI Front-End Design Skills — Six Ironclad Rules Against AI Sloppiness, Design Direction Consulting, Brand Asset Guidelines, and Five…
☆134 · May 1, 2026 · Updated last month
psmarter / PMPP-Learning
Programming Massively Parallel Processors (4th Ed.): study notes, exercise solutions, and CUDA implementations
☆205 · May 11, 2026 · Updated last month
yycyyv / M-Cube
M-Cube (M³) — Multi-thinking, Multimodal, Multi-verification Patent Drafting Assistant
☆202 · May 23, 2026 · Updated last month
OldJii / mcp-dock
MCP Server & Config Manager for 14 AI Clients — Cursor, VS Code, Claude Code, Gemini CLI, Windsurf, Zed, TRAE, Kiro, JetBrains & more. 85…
☆172 · Updated this week
bupt-ai-cz / PEOD
Data and Codes for Our Paper "PEOD: A Pixel-Aligned Event-RGB Benchmark for Object Detection under Challenging Conditions"
☆162 · May 25, 2026 · Updated last month
yusifeng / formax
Terminal-first AI assistant for software engineering tasks (inspired by Claude Code v2.0.67)
☆191 · Jun 1, 2026 · Updated 3 weeks ago
AQADIL / JudGO
Modern Online Judge system with secure code execution and live coding battles. Powered by Golang
☆71 · May 18, 2026 · Updated last month
Kirazul / Verome-API
A music API built with Deno for searching, streaming, and exploring music data from YouTube Music, YouTube, and Last.fm.
☆226 · May 12, 2026 · Updated last month
KOKOSde / onchain-sybil-detector
Open-source behavioral Sybil attack detection for blockchain networks
☆76 · Mar 23, 2026 · Updated 3 months ago
TobisawaAyamaru / omniAgent
A simple desktop agent application
☆47 · Mar 9, 2026 · Updated 3 months ago
wangtao2001 / Muxify
Muxify is a VSCode extension that allows you to visually manage tmux sessions, windows, and panes directly from the sidebar - no need to …
☆94 · Feb 1, 2026 · Updated 4 months ago
cls-cloud / ovra-zero
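A RuoYi server-side scaffold built on Go-Zero. It provides a complete permission system, multi-tenant support, RBAC access control, menu management, and more, and is suited to quickly building enterprise back-office management systems.
☆223 · May 8, 2026 · Updated last month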
Ahmad-beast / Al_based_Real-time_Traffic_managment_system
CS 21-26
☆77 · Mar 15, 2026 · Updated 3 months ago
yominsops / yomins-agent
A lightweight Go agent that collects host-level system metrics and pushes them to the Yomins monitoring stack.
☆75 · May 1, 2026 · Updated last month
kakobuyspreadsheet2026 / Kakobuy-Sugargoo-ACbuy-OOPbuy-Superbuy-Spreadsheet-2026
Kakobuy Spreadsheet features 3,000+ trending products from Weidian, Taobao, and 1688, with affordable new arrivals added daily. Exp…
☆143 · Mar 21, 2026 · Updated 3 months ago
MuShan-bit / TideDesk
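TideDesk is an automation workbench for content operations and knowledge archiving. It supports binding multiple X accounts; fetching recommended, trending, and search content; deduplicated archiving; category and tag organization; automatic AI analysis; weekly and monthly report generation; and one-click distribution to platforms such as WeChat Official Accounts, Zhihu, and CSDN, helping individuals or teams turn information streams into manageable, reusable, publishable content…
☆51 · Mar 24, 2026 · Updated 3 months ago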
AnimePub / AniPub
A Modern, Ad Free And Simple Anime Watching Site
☆101 · Jun 18, 2026 · Updated last week
cybermonkjbot / lifemanager
☆56 · Updated this week
FarmerAgentbsc / farmer-agent-app
Farmer Agent on BSC Chain: farm-themed AI agent dashboard, cloud console, airdrop festival, and deployment stack.
☆67 · Apr 21, 2026 · Updated 2 months ago
tyxben / arcana
Controllable, Reproducible, Evaluable Agent Platform
☆211 · Updated this week
lostf1sh / lostf1sh.github.io
🌐 My personal website.
☆81 · Jun 14, 2026 · Updated 2 weeks ago
Edennnnnnnnnn / DragFlow
[ICLR 2026] "DragFlow: Unleashing DiT Priors with Region Based Supervision for Drag Editing" (Official Implementation)
☆159 · Mar 4, 2026 · Updated 3 months ago
wangsan0 / LinJun
LinJun (霖君): a client application built on CLIProxy
☆17 · Feb 6, 2026 · Updated 4 months ago
Samirtemtem / universal-eas-local-builder
Universal EAS local builder with configurable Kotlin versions and auto-fixes.
☆118 · Aug 6, 2025 · Updated 10 months ago
Lsq128 / web3-lab
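A Web3 learning and experimentation front-end project based on the Next.js App Router, for practicing common scenarios such as wallet connection, on-chain queries, and simple transfers
☆98 · Jan 30, 2026 · Updated 4 months ago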
danielcy / coze-mcp-for-openclaw
Coze MCP and Skill Management for OpenClaw
☆97 · Mar 11, 2026 · Updated 3 months ago
shuyicc / MathLens
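MathLens is an Agent Skill focused on video explanations of math problems. Paste in a math problem (image or text) and it automatically handles the whole pipeline: problem analysis, visual explanation, voice-over script, and the Manim animation video. Each video runs 1-10 minutes and costs no more than 0.2-1 yuan.
☆373 · Mar 10, 2026 · Updated 3 months ago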
qazzxxx / cloudimgs
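Yuntu (云图): a minimalist cloud image gallery with NAS deployment support, configurable access keys, a range of flexible open APIs, NAS image hosting, and a PicGo plugin that can be installed and used directly
☆827 · Updated this week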
xt765 / mermaid-trace
Stop reading logs. Start watching them. MermaidTrace is a specialized logging tool that automatically generates Mermaid JS sequence diag…
☆83 · Mar 6, 2026 · Updated 3 months ago
0xranx / agentbrief
Pluggable role definitions for AI coding agents — one command turns Claude Code / Cursor / OpenCode / Codex into a specialized profession…
☆45 · Mar 28, 2026 · Updated 3 months ago
PKU-PCNI / WiFo-CF
☆78 · May 4, 2026 · Updated last month
SodaSizzle / astrbot_plugin_meme_generator
Meme generator plugin
☆67 · Apr 23, 2026 · Updated 2 months ago
Live-GalGame / narrarc
☆121 · Mar 3, 2026 · Updated 3 months ago
tyxben / AI_novel
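Automated AI novel promo videos: one-click conversion of novels into short videos (audiobook narration plus AI-generated illustrations), suited to Douyin and Xiaohongshu
☆235 · May 25, 2026 · Updated last month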
linkxzhou / SimpleMind
A mind-mapping tool in which AI automatically reasons using approaches such as summarization, induction, and first principles to generate mind maps
☆101 · Feb 13, 2026 · Updated 4 months ago