LLM inference engine from scratch — paged KV cache, continuous batching, chunked prefill, prefix caching, speculative decoding, CUDA graph, tensor parallelism, OpenAI-compatible serving
☆264 · Apr 24, 2026 · Updated 2 months ago
Alternatives and similar repositories for mini-infer
Users that are interested in mini-infer are comparing it to the libraries listed below.
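- A collection of meme and sticker images from social platforms ☆102 · Jun 15, 2026 · Updated last week
- ☆108 · Mar 23, 2026 · Updated 3 months ago
- An ultra-fast sorting algorithm written by a first-year high school student with AI assistance, offering adaptive behavior and other features and, per its author, already at an industrial standard ☆72 · Jan 24, 2026 · Updated 5 months ago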
- geo-cultural-encoding ☆74 · Jan 6, 2026 · Updated 5 months ago
- SwiftUI Front-End Design Skills: Six Ironclad Rules Against AI Sloppiness, Design Direction Consulting, Brand Asset Guidelines, and Five… ☆134 · May 1, 2026 · Updated last month
- Programming Massively Parallel Processors (4th Ed.): study notes, exercise solutions, and CUDA implementations ☆205 · May 11, 2026 · Updated last month
- M-Cube (M³): Multi-thinking, Multimodal, Multi-verification Patent Drafting Assistant ☆202 · May 23, 2026 · Updated last month
- MCP Server & Config Manager for 14 AI Clients: Cursor, VS Code, Claude Code, Gemini CLI, Windsurf, Zed, TRAE, Kiro, JetBrains & more. 85… ☆172 · Updated this week
- Data and Codes for Our Paper "PEOD: A Pixel-Aligned Event-RGB Benchmark for Object Detection under Challenging Conditions" ☆162 · May 25, 2026 · Updated last month
- Terminal-first AI assistant for software engineering tasks (inspired by Claude Code v2.0.67) ☆191 · Jun 1, 2026 · Updated 3 weeks ago
- Modern Online Judge system with secure code execution and live coding battles. Powered by Golang ☆71 · May 18, 2026 · Updated last month
- A music API built with Deno for searching, streaming, and exploring music data from YouTube Music, YouTube, and Last.fm. ☆226 · May 12, 2026 · Updated last month
- Open-source behavioral Sybil attack detection for blockchain networks ☆76 · Mar 23, 2026 · Updated 3 months ago
- A simple desktop agent application ☆47 · Mar 9, 2026 · Updated 3 months ago
- Muxify is a VSCode extension that allows you to visually manage tmux sessions, windows, and panes directly from the sidebar - no need to … ☆94 · Feb 1, 2026 · Updated 4 months ago
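- A RuoYi server-side scaffold built on Go-Zero, providing a complete permission system, multi-tenant support, RBAC access control, menu management, and more; suited to quickly building enterprise back-office management systems. ☆223 · May 8, 2026 · Updated last month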
- CS 21-26 ☆77 · Mar 15, 2026 · Updated 3 months ago
- A lightweight Go agent that collects host-level system metrics and pushes them to the Yomins monitoring stack. ☆75 · May 1, 2026 · Updated last month
- Kakobuy Spreadsheet features 3,000+ trending products from Weidian, Taobao, and 1688, with affordable new arrivals added daily. Exp… ☆143 · Mar 21, 2026 · Updated 3 months ago
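- TideDesk is an automation workbench for content operations and knowledge archiving. It supports binding multiple X accounts, fetching recommended, trending, and search content, deduplicating and archiving it, organizing categories and tags, running automatic AI analysis, generating weekly and monthly reports, and one-click distribution to WeChat Official Accounts, Zhihu, CSDN, and other platforms, helping individuals or teams turn information streams into manageable, reusable, publishable content… ☆51 · Mar 24, 2026 · Updated 3 months ago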
- A Modern, Ad Free And Simple Anime Watching Site ☆101 · Jun 18, 2026 · Updated last week
- ☆56 · Updated this week
- Farmer Agent on BSC Chain: farm-themed AI agent dashboard, cloud console, airdrop festival, and deployment stack. ☆67 · Apr 21, 2026 · Updated 2 months ago
- Controllable, Reproducible, Evaluable Agent Platform ☆211 · Updated this week
- 🌐 My personal website. ☆81 · Jun 14, 2026 · Updated 2 weeks ago
- [ICLR 2026] "DragFlow: Unleashing DiT Priors with Region Based Supervision for Drag Editing" (Official Implementation) ☆159 · Mar 4, 2026 · Updated 3 months ago
- A client application built on CLIProxy - 霖君 ☆17 · Feb 6, 2026 · Updated 4 months ago
- Universal EAS local builder with configurable Kotlin versions and auto-fixes. ☆118 · Aug 6, 2025 · Updated 10 months ago
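- A Web3 learning and experimentation front-end project based on the Next.js App Router, for practicing common scenarios such as wallet connection, on-chain queries, and simple transfers ☆98 · Jan 30, 2026 · Updated 4 months ago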
- Coze MCP and Skill Management for OpenClaw ☆97 · Mar 11, 2026 · Updated 3 months ago
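- MathLens is an Agent Skill focused on video explanations of math problems. Paste in a math problem (image or text) and it automatically handles the full pipeline, from problem analysis, visual explanation, and voice-over script to a Manim animation video. Each video runs 1-10 minutes and costs no more than 0.2-1 yuan. ☆373 · Mar 10, 2026 · Updated 3 months ago
- 云图: a minimalist cloud image gallery that supports NAS deployment, configurable access keys, a range of flexible open API endpoints, use as a NAS image host, and direct installation of a PicGo plugin ☆827 · Updated this week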
- Stop reading logs. Start watching them. MermaidTrace is a specialized logging tool that automatically generates Mermaid JS sequence diag… ☆83 · Mar 6, 2026 · Updated 3 months ago
- Pluggable role definitions for AI coding agents: one command turns Claude Code / Cursor / OpenCode / Codex into a specialized profession… ☆45 · Mar 28, 2026 · Updated 3 months ago
- ☆78 · May 4, 2026 · Updated last month
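- A meme and sticker generation plugin ☆67 · Apr 23, 2026 · Updated 2 months ago
- ☆121 · Mar 3, 2026 · Updated 3 months ago
- AI automation for novel promotion: one-click conversion of novels into short videos (audiobook narration plus AI-generated images), suitable for Douyin and Xiaohongshu ☆235 · May 25, 2026 · Updated last month
- A mind-mapping tool in which AI automatically reasons using approaches such as summarization, induction, and first principles to generate mind maps ☆101 · Feb 13, 2026 · Updated 4 months ago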