LLM inference engine from scratch — paged KV cache, continuous batching, chunked prefill, prefix caching, speculative decoding, CUDA graph, tensor parallelism, OpenAI-compatible serving
☆260 · Apr 24, 2026 · Updated last month
Alternatives and similar repositories for mini-infer
Users who are interested in mini-infer are comparing it to the libraries listed below.
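- A collection of sticker and meme packs from social platforms ☆110 · Updated this week
- ☆116 · Mar 23, 2026 · Updated 2 months ago
- An ultra-fast sorting algorithm written by a first-year high school student with AI assistance; it offers adaptive behavior and other features and is described as meeting industrial-grade standards ☆80 · Jan 24, 2026 · Updated 4 months ago
- SwiftUI Front-End Design Skills: Six Ironclad Rules Against AI Sloppiness, Design Direction Consulting, Brand Asset Guidelines, and Five… ☆125 · May 1, 2026 · Updated last month
- M-Cube (M³): Multi-thinking, Multimodal, Multi-verification Patent Drafting Assistant ☆200 · May 23, 2026 · Updated 2 weeks ago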
- geo-cultural-encoding ☆82 · Jan 6, 2026 · Updated 5 months ago
- MCP Server & Config Manager for 14 AI Clients: Cursor, VS Code, Claude Code, Gemini CLI, Windsurf, Zed, TRAE, Kiro, JetBrains & more. 85… ☆158 · May 19, 2026 · Updated 2 weeks ago
- Programming Massively Parallel Processors (4th Ed.): study notes, exercise solutions, and CUDA implementations ☆203 · May 11, 2026 · Updated 3 weeks ago
- Data and Codes for Our Paper "PEOD: A Pixel-Aligned Event-RGB Benchmark for Object Detection under Challenging Conditions" ☆160 · May 25, 2026 · Updated 2 weeks ago
- Terminal-first AI assistant for software engineering tasks (inspired by Claude Code v2.0.67) ☆180 · Jun 1, 2026 · Updated last week
- Modern Online Judge system with secure code execution and live coding battles. Powered by Golang ☆79 · May 18, 2026 · Updated 2 weeks ago
- A music API built with Deno for searching, streaming, and exploring music data from YouTube Music, YouTube, and Last.fm. ☆231 · May 12, 2026 · Updated 3 weeks ago
- Open-source behavioral Sybil attack detection for blockchain networks ☆86 · Mar 23, 2026 · Updated 2 months ago
- A simple desktop agent application ☆47 · Mar 9, 2026 · Updated 2 months ago
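- Muxify is a VSCode extension that allows you to visually manage tmux sessions, windows, and panes directly from the sidebar - no need to … ☆103 · Feb 1, 2026 · Updated 4 months ago
- A RuoYi server-side scaffold built on Go-Zero, providing a complete permission system, multi-tenant support, RBAC access control, menu management, and more; suited to quickly building enterprise back-office management systems. ☆226 · May 8, 2026 · Updated last month
- CS 21-26 ☆85 · Mar 15, 2026 · Updated 2 months ago
- A lightweight Go agent that collects host-level system metrics and pushes them to the Yomins monitoring stack. ☆85 · May 1, 2026 · Updated last month
- ☆68 · Updated this week
- Kakobuy Spreadsheet features 3,000+ trending products from Weidian, Taobao, and 1688, with affordable new arrivals added daily. Exp… ☆147 · Mar 21, 2026 · Updated 2 months ago
- TideDesk is an automation workbench for content operations and knowledge archiving. It supports binding multiple X accounts, fetching recommended, trending, and search content, deduplicating and archiving it, organizing categories and tags, running automatic AI analysis, generating weekly and monthly reports, and distributing with one click to WeChat Official Accounts, Zhihu, CSDN, and other platforms, helping individuals or teams turn information streams into manageable, reusable, publishable content… ☆45 · Mar 24, 2026 · Updated 2 months ago
- A Modern, Ad Free And Simple Anime Watching Site ☆106 · May 27, 2026 · Updated last week
- Controllable, Reproducible, Evaluable Agent Platform ☆215 · Updated this week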
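- Farmer Agent on BSC Chain: farm-themed AI agent dashboard, cloud console, airdrop festival, and deployment stack. ☆76 · Apr 21, 2026 · Updated last month
- [ICLR 2026] "DragFlow: Unleashing DiT Priors with Region Based Supervision for Drag Editing" (Official Implementation) ☆158 · Mar 4, 2026 · Updated 3 months ago
- 🌐 My personal website. ☆88 · Updated this week
- A client application built on CLIProxy - 霖君 ☆17 · Feb 6, 2026 · Updated 4 months ago
- Universal EAS local builder with configurable Kotlin versions and auto-fixes. ☆127 · Aug 6, 2025 · Updated 10 months ago
- A Web3 learning and experimentation front-end project based on the Next.js App Router, for practicing common scenarios such as wallet connection, on-chain queries, and simple transfers ☆107 · Jan 30, 2026 · Updated 4 months ago
- Coze MCP and Skill Management for OpenClaw ☆99 · Mar 11, 2026 · Updated 2 months ago
- MathLens is an Agent Skill focused on video explanations of math problems. Paste in a math problem (image or text) and it automatically handles the whole pipeline, from problem analysis and visual explanation to voice-over script and Manim animated video. Each video runs 1-10 minutes and costs within 0.2-1 yuan. ☆378 · Mar 10, 2026 · Updated 2 months ago
- 云图: a minimalist cloud image gallery that supports NAS deployment, configurable access keys, and a range of flexible open APIs; works as a NAS image host, with a PicGo plugin that can be installed and used directly ☆793 · May 18, 2026 · Updated 3 weeks ago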
- Stop reading logs. Start watching them. MermaidTrace is a specialized logging tool that automatically generates Mermaid JS sequence diag… ☆82 · Mar 6, 2026 · Updated 3 months ago
- Pluggable role definitions for AI coding agents: one command turns Claude Code / Cursor / OpenCode / Codex into a specialized profession… ☆45 · Mar 28, 2026 · Updated 2 months ago
- ☆87 · May 4, 2026 · Updated last month
- ☆128 · Mar 3, 2026 · Updated 3 months ago
- AI novel-promotion automation: one-click conversion of novels into short videos (audiobook narration plus AI-generated illustrations), suitable for Douyin/Xiaohongshu ☆236 · May 25, 2026 · Updated 2 weeks ago
- A mind-mapping tool in which AI automatically reasons using approaches such as summarization, induction, and first principles to generate mind maps ☆109 · Feb 13, 2026 · Updated 3 months ago
- ☆63 · Dec 31, 2025 · Updated 5 months ago