A handbook on building your own AI inference engine: everything you need to know, starting from zero
☆274Sep 8, 2022Updated 3 years ago
Alternatives and similar repositories for AI-Infer-Engine-From-Zero
Users that are interested in AI-Infer-Engine-From-Zero are comparing it to the libraries listed below.
- ffmpeg+cuvid+tensorrt+multicamera☆12Dec 31, 2024Updated last year
- MegCC is a deep learning model compiler with an ultra-lightweight runtime, high efficiency, and easy portability☆484Oct 23, 2024Updated last year
- Various test models in WNNX format. They can be viewed with `pip install wnetron && wnetron`☆12Jun 22, 2022Updated 3 years ago
- SGEMM optimization with cuda step by step☆22Mar 23, 2024Updated 2 years ago
- Make a minimal OpenCV runnable anywhere, WIP☆87Jan 16, 2023Updated 3 years ago
- how to optimize some algorithm in cuda.☆2,910Apr 1, 2026Updated last week
- 《Machine Learning Systems: Design and Implementation》 (V2 is launching soon)☆4,798Mar 15, 2026Updated 3 weeks ago
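- A great project for campus recruiting (autumn and spring hiring rounds) and internships! Build a high-performance deep learning inference library from scratch, supporting inference for models such as llama2, Unet, Yolov5, and Resnet. Implement a high-performance deep learning inference library st…☆3,396Jun 22, 2025Updated 9 months ago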
- A simple neural network inference framework☆25Aug 1, 2023Updated 2 years ago
- row-major matmul optimization☆714Feb 24, 2026Updated last month
- Example of SenseCraft Model Assistant Model deployment related to ESP32☆33Apr 9, 2025Updated last year
- An Easy-to-Use and High-Performance AI Deployment Framework☆1,782Updated this week
- FP8 flash attention implemented on the Ada architecture using the cutlass repository☆82Aug 12, 2024Updated last year
- Companion code for the bilibili video series "Introduction to CUDA 12.x Parallel Programming (C++ Edition)"☆34Aug 12, 2024Updated last year
- ☆10Jul 18, 2024Updated last year
- A layered, decoupled deep learning inference engine☆79Feb 17, 2025Updated last year
- Learning a range of topics by following Tensorrt_pro☆39Nov 25, 2022Updated 3 years ago
- An editor for the ncnn and pnnx formats☆138Oct 7, 2024Updated last year
- ⚡️ Using NNIE as simple as using ncnn ⚡️☆184Jan 26, 2022Updated 4 years ago
- A deployment version of FastSAM: easy to port to different platforms, simple to deploy, and fast at runtime.☆24May 30, 2024Updated last year
- A library for high performance deep learning inference on NVIDIA GPUs.☆554Jan 29, 2022Updated 4 years ago
- A collection of saved code snippets☆13Jun 6, 2023Updated 2 years ago
- PPL Quantization Tool (PPQ) is a powerful offline neural network quantization tool.☆1,789Mar 28, 2024Updated 2 years ago
- 🛠A lite C++ AI toolkit: 100+ models with MNN, ORT and TRT, including Det, Seg, Stable-Diffusion, Face-Fusion, etc.🎉☆4,398Mar 19, 2026Updated 3 weeks ago
- Performance of the C++ interface of flash attention and flash attention v2 in large language model (LLM) inference scenarios.☆45Feb 27, 2025Updated last year
- ☆25Aug 27, 2021Updated 4 years ago
- rknn inference☆48Mar 7, 2022Updated 4 years ago
- Tutorials for writing high-performance GPU operators in AI frameworks.☆135Aug 12, 2023Updated 2 years ago
- A primitive library for neural network☆1,368Nov 24, 2024Updated last year
- YoloV8 segmentation NPU for the RK 3566/68/88☆18Apr 30, 2024Updated last year
- A collection of compiler learning resources.☆2,707Mar 19, 2025Updated last year
- NART = NART is not A RunTime, a deep learning inference framework.☆37Mar 2, 2023Updated 3 years ago
- ☆2,006Jul 29, 2023Updated 2 years ago
- YOLOv3, YOLOv4, YOLOv5, YOLOv5-Lite, YOLOv6-v1, YOLOv6-v2, YOLOv7, YOLOX, YOLOX-Lite, PP-YOLOE, PP-PicoDet-Plus, YOLO-Fastest v2, FastestDet, YOLOv5-S…☆767Oct 25, 2022Updated 3 years ago
- Quantize yolov5 using pytorch_quantization.🚀🚀🚀☆14Oct 24, 2023Updated 2 years ago
- ggml study notes; ggml is an inference framework for machine learning☆18Mar 24, 2024Updated 2 years ago
- YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite > UF2☆21Oct 15, 2024Updated last year
- A cross-platform framework that deploys and applies ModelAssistant models to microcontroller devices☆42Mar 5, 2026Updated last month
- Tutorials on extending and importing TVM via a CMake include dependency.☆16Oct 11, 2024Updated last year