A handbook on building your own AI inference engine: everything you need to know, starting from zero
☆273Sep 8, 2022Updated 3 years ago
Alternatives and similar repositories for AI-Infer-Engine-From-Zero
Users that are interested in AI-Infer-Engine-From-Zero are comparing it to the libraries listed below.
- ffmpeg+cuvid+tensorrt+multicamera☆12Dec 31, 2024Updated last year
- MegCC is a deep learning model compiler with an ultra-lightweight runtime, high efficiency, and easy portability☆484Oct 23, 2024Updated last year
- Various test models in WNNX format. They can be viewed with `pip install wnetron && wnetron`☆12Jun 22, 2022Updated 3 years ago
- SGEMM optimization with cuda step by step☆22Mar 23, 2024Updated 2 years ago
- Make a minimal OpenCV runnable anywhere, WIP☆87Jan 16, 2023Updated 3 years ago
- how to optimize some algorithm in cuda.☆2,872Mar 17, 2026Updated last week
- 《Machine Learning Systems: Design and Implementation》 (V2 is launching soon)☆4,781Mar 15, 2026Updated last week
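- A great project for campus recruiting, fall and spring hiring, and internships! Walks you through implementing a high-performance deep learning inference library from scratch, supporting inference for models such as llama2, Unet, Yolov5, and Resnet. Implement a high-performance deep learning inference library st…☆3,370Jun 22, 2025Updated 9 months ago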
- A simple neural network inference framework☆25Aug 1, 2023Updated 2 years ago
- row-major matmul optimization☆712Feb 24, 2026Updated last month
- Example of SenseCraft Model Assistant Model deployment related to ESP32☆32Apr 9, 2025Updated 11 months ago
- An Easy-to-Use and High-Performance AI Deployment Framework☆1,767Mar 15, 2026Updated last week
- FP8 flash attention implemented on the Ada architecture using the cutlass repository☆81Aug 12, 2024Updated last year
- Companion code for the bilibili video "Introduction to CUDA 12.x Parallel Programming (C++ Edition)"☆34Aug 12, 2024Updated last year
- ☆10Jul 18, 2024Updated last year
- A layered, decoupled deep learning inference engine☆79Feb 17, 2025Updated last year
- Learning a range of topics by following Tensorrt_pro☆39Nov 25, 2022Updated 3 years ago
- An editor for the ncnn and pnnx formats☆137Oct 7, 2024Updated last year
- ⚡️ Using NNIE as simple as using ncnn ⚡️☆184Jan 26, 2022Updated 4 years ago
- A deployment version of FastSAM: easy to port to different platforms, simple to deploy, and fast to run.☆24May 30, 2024Updated last year
- A library for high performance deep learning inference on NVIDIA GPUs.☆554Jan 29, 2022Updated 4 years ago
- A collection of code snippets worth keeping☆13Jun 6, 2023Updated 2 years ago
- PPL Quantization Tool (PPQ) is a powerful offline neural network quantization tool.☆1,789Mar 28, 2024Updated last year
- Performance of the C++ interface of flash attention and flash attention v2 in large language model (LLM) inference scenarios.☆44Feb 27, 2025Updated last year
- 🛠A lite C++ AI toolkit: 100+ models with MNN, ORT and TRT, including Det, Seg, Stable-Diffusion, Face-Fusion, etc.🎉☆4,391Mar 19, 2026Updated last week
- ☆25Aug 27, 2021Updated 4 years ago
- rknn inference☆48Mar 7, 2022Updated 4 years ago
- Tutorials for writing high-performance GPU operators in AI frameworks.☆135Aug 12, 2023Updated 2 years ago
- A primitive library for neural network☆1,367Nov 24, 2024Updated last year
- compiler learning resources collect.☆2,693Mar 19, 2025Updated last year
- YoloV8 segmentation NPU for the RK 3566/68/88☆17Apr 30, 2024Updated last year
- NART = NART is not A RunTime, a deep learning inference framework.☆37Mar 2, 2023Updated 3 years ago
- ☆2,001Jul 29, 2023Updated 2 years ago
- YOLOv3, YOLOv4, YOLOv5, YOLOv5-Lite, YOLOv6-v1, YOLOv6-v2, YOLOv7, YOLOX, YOLOX-Lite, PP-YOLOE, PP-PicoDet-Plus, YOLO-Fastest v2, FastestDet, YOLOv5-S…☆766Oct 25, 2022Updated 3 years ago
- Study notes on ggml, an inference framework for machine learning☆18Mar 24, 2024Updated 2 years ago
- YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite > UF2☆21Oct 15, 2024Updated last year
- A cross-platform framework that deploys and applies ModelAssistant models to microcontrol devices☆42Mar 5, 2026Updated 2 weeks ago
- Tutorials on extending and importing TVM with a CMake include dependency.☆16Oct 11, 2024Updated last year
- NVIDIA TensorRT Hackathon 2023 semifinal topic: building and optimizing the Tongyi Qianwen Qwen-7B model with TensorRT-LLM☆43Oct 20, 2023Updated 2 years ago