InfiniTensor/RefactorGraph

A layered, decoupled deep learning inference engine

☆79

Alternatives and similar repositories for RefactorGraph

Users who are interested in RefactorGraph are comparing it to the libraries listed below.

InfiniTensor / InfiniTensor
InfiniTensor is a high-performance inference engine tailored for GPUs and AI accelerators. Its design focuses on effective deployment and…
☆312 · Mar 16, 2026 · Updated 3 weeks ago
InfiniTensor / InfiniLM-Rust
☆126 · Jan 22, 2026 · Updated 2 months ago
YdrMaster / cuda-driver
A CUDA runtime environment based on the CUDA Driver API
☆16 · Jul 30, 2025 · Updated 8 months ago
richjjj / cuvid-tensorrt-multi
ffmpeg+cuvid+tensorrt+multicamera
☆12 · Dec 31, 2024 · Updated last year
lovemefan / ggml-learning-notes
ggml learning notes; ggml is an inference framework for machine learning
☆18 · Mar 24, 2024 · Updated 2 years ago
KuangjuX / cu-x
🎉My Collections of CUDA Kernels~
☆11 · Jun 25, 2024 · Updated last year
LearningInfiniTensor / TinyInfiniTensor
☆44 · Jan 8, 2025 · Updated last year
YdrMaster / notebook
Notes
☆53 · Aug 15, 2025 · Updated 7 months ago
Cjkkkk / CUDA_gemm
A simple high performance CUDA GEMM implementation.
☆430 · Jan 4, 2024 · Updated 2 years ago
zjd1988 / video_pipe_c
A plugin-oriented framework for video structuring. Developers in China: add WeChat zhzhi78 to join the discussion group.
☆18 · May 28, 2024 · Updated last year
tlc-pack / cutlass_fpA_intB_gemm
A standalone GEMM kernel for fp16 activation and quantized weight, extracted from FasterTransformer
☆96 · Feb 20, 2026 · Updated last month
ling0322 / libllm
Efficient inference of large language models.
☆150 · Sep 28, 2025 · Updated 6 months ago
ChunelFeng / CGraph-lite
A one-page-only DAG project with a CGraph-like API.
☆26 · Feb 11, 2025 · Updated last year
YdrMaster / dtb-walker
Traverses device tree blobs (DTB)
☆15 · Nov 22, 2025 · Updated 4 months ago
stemnic / rustyvisor
Hypervisor written in Rust for the RISC-V 1.0 hypervisor extension
☆16 · Oct 21, 2024 · Updated last year
triple-mu / HunyuanDiT-TensorRT-libtorch
HunyuanDiT with TensorRT and libtorch
☆18 · May 22, 2024 · Updated last year
ZHEQIUSHUI / CLIP-ONNX-AX650-CPP
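CLIP inference implemented in C++. The model has been modified slightly; the changes and the model export code can be found in the README, and the model files, including the AX650 models, are all in Releases. Support for ChineseCLIP has been added.
☆31 · Jun 19, 2025 · Updated 9 months ago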
MegEngine / MegCC
MegCC is a deep learning model compiler with an ultra-lightweight runtime that is efficient and easy to port
☆484 · Oct 23, 2024 · Updated last year
InfiniTensor / ninetoothed
A domain-specific language (DSL) based on Triton but providing higher-level abstractions.
☆125 · Updated this week
njuhope / cuda_sgemm
☆120 · Apr 11, 2024 · Updated last year
caiwanxianhust / FasterLLaMA
A llama model inference framework implemented in CUDA C++
☆65 · Nov 8, 2024 · Updated last year
li199603 / sgemm_with_cuda
Step-by-step SGEMM optimization with CUDA
☆22 · Mar 23, 2024 · Updated 2 years ago
daquexian / web-model-converter
☆42 · Nov 29, 2022 · Updated 3 years ago
YdrMaster / llama2.rs
Experiment: llama2 inference implemented in Rust
☆17 · Feb 23, 2024 · Updated 2 years ago
leimao / Nsight-Compute-Docker-Image
Nsight Compute In Docker
☆13 · Dec 21, 2023 · Updated 2 years ago
Tartisan / MMDet3d-PointPillars
PointPillars TensorRT version pretrained on MMDetection3d with WaymoOpenDataset
☆23 · Aug 11, 2022 · Updated 3 years ago
raymond1123 / hgemm
☆30 · Nov 16, 2024 · Updated last year
KuangjuX / Paper-reading
My Paper Reading Lists and Notes.
☆21 · Mar 28, 2026 · Updated last week
dianhsu / transformer-cpp-cpu
A simple Transformer model implemented in C++. Attention Is All You Need.
☆54 · Mar 11, 2021 · Updated 5 years ago
eth-cscs / Tiled-MM
Matrix multiplication on GPUs for matrices stored on a CPU. Similar to cublasXt, but ported to both NVIDIA and AMD GPUs.
☆32 · Apr 2, 2025 · Updated last year
torchpipe / torchpipe
Serving Inside Pytorch
☆171 · Feb 3, 2026 · Updated 2 months ago
InfiniTensor / InfiniLM
☆68 · Updated this week
AXERA-TECH / OWLVIT-ONNX-AX650-CPP
☆23 · Jan 3, 2024 · Updated 2 years ago
LearningInfiniTensor / learning-cxx
☆77 · Jan 25, 2025 · Updated last year
Bruce-Lee-LY / cutlass_gemm
Multiple GEMM operators are constructed with cutlass to support LLM inference.
☆20 · Aug 3, 2025 · Updated 8 months ago
InfiniTensor / operators
Operator library
☆17 · Jul 9, 2025 · Updated 9 months ago
inferflow / inferflow
Inferflow is an efficient and highly configurable inference engine for large language models (LLMs).
☆250 · Mar 15, 2024 · Updated 2 years ago
weishengying / cutlass_flash_atten_fp8
FP8 flash attention implemented on the Ada architecture using the cutlass repository
☆82 · Aug 12, 2024 · Updated last year
InfiniTensor / InfiniCore
☆53 · Updated this week