A layered, decoupled deep learning inference engine
☆79Feb 17, 2025Updated last year
Alternatives and similar repositories for RefactorGraph
Users that are interested in RefactorGraph are comparing it to the libraries listed below
- ☆289Feb 4, 2026Updated 3 weeks ago
- A CUDA runtime environment based on the CUDA Driver API☆15Jul 30, 2025Updated 6 months ago
- ☆126Jan 22, 2026Updated last month
- 🎉My Collections of CUDA Kernels~☆11Jun 25, 2024Updated last year
- ffmpeg+cuvid+tensorrt+multicamera☆12Dec 31, 2024Updated last year
- A simple high performance CUDA GEMM implementation.☆426Jan 4, 2024Updated 2 years ago
- ggml study notes; ggml is an inference framework for machine learning☆18Mar 24, 2024Updated last year
- A standalone GEMM kernel for fp16 activation and quantized weight, extracted from FasterTransformer☆96Feb 20, 2026Updated last week
- HunyuanDiT with TensorRT and libtorch☆18May 22, 2024Updated last year
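- A plugin-oriented framework for video structuring. Developers in China: add WeChat zhzhi78 to join the discussion group.☆18May 28, 2024Updated last year
- CLIP inference implemented in C++. The model is slightly modified; the changes and the model export code are described in the README, and the model files, including those for AX650, are in Releases. ChineseCLIP support has been added.☆31Jun 19, 2025Updated 8 months ago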
- Efficient inference of large language models.☆149Sep 28, 2025Updated 4 months ago
- MegCC is a deep learning model compiler with an ultra-lightweight runtime that is efficient and easy to port☆486Oct 23, 2024Updated last year
- Notes☆51Aug 15, 2025Updated 6 months ago
- ☆43Jan 8, 2025Updated last year
- Contains only face detection, 5/81 points, and face recognition models☆11Jul 9, 2020Updated 5 years ago
- PointPillars TensorRT version pretrained on MMDetection3d with WaymoOpenDataset☆22Aug 11, 2022Updated 3 years ago
- ☆23Jan 3, 2024Updated 2 years ago
- SGEMM optimization with cuda step by step☆21Mar 23, 2024Updated last year
- Nsight Compute In Docker☆13Dec 21, 2023Updated 2 years ago
- Traverses device tree blobs☆14Nov 22, 2025Updated 3 months ago
- A domain-specific language (DSL) based on Triton but providing higher-level abstractions.☆41Feb 4, 2026Updated 3 weeks ago
- A one-page DAG project with a CGraph-like API.☆26Feb 11, 2025Updated last year
- Experiment: llama2 inference implemented in Rust☆17Feb 23, 2024Updated 2 years ago
- The text detection and recognition component extracted from MinerU; implements PaddleOCR text detection and recognition in PyTorch☆17Jun 2, 2025Updated 8 months ago
- ☆42Nov 29, 2022Updated 3 years ago
- Serving Inside Pytorch☆170Feb 3, 2026Updated 3 weeks ago
- A simple neural network inference framework☆25Aug 1, 2023Updated 2 years ago
- Inferflow is an efficient and highly configurable inference engine for large language models (LLMs).☆250Mar 15, 2024Updated last year
- A llama model inference framework implemented in CUDA C++☆64Nov 8, 2024Updated last year
- ☆30Nov 16, 2024Updated last year
- Matrix multiplication on GPUs for matrices stored on a CPU. Similar to cublasXt, but ported to both NVIDIA and AMD GPUs.☆32Apr 2, 2025Updated 10 months ago
- how to learn PyTorch and OneFlow☆485Mar 22, 2024Updated last year
- ☆17Jan 1, 2024Updated 2 years ago
- ☆27Jan 7, 2025Updated last year
- Hypervisor written in Rust for the RISC-V 1.0 hypervisor extension☆16Oct 21, 2024Updated last year
- ☆25Aug 27, 2021Updated 4 years ago
- C++ implementations for various tokenizers (sentencepiece, tiktoken etc).☆49Feb 20, 2026Updated last week
- High-performance AI model web server. Mainly supports computer vision models. Quickly set up your own AI model server. https://github.com/…☆45May 13, 2025Updated 9 months ago