syoyo/safetensors-cpp

Header-only safetensors loader and saver in C++

☆89

Alternatives and similar repositories for safetensors-cpp

Users that are interested in safetensors-cpp are comparing it to the libraries listed below.

carsonpo / safetensors.cpp
Zero Dependency LibTorch Safetensors Loading and Storing in C++
☆23 Jul 12, 2024 Updated 2 years ago
triple-mu / HunyuanDiT-TensorRT-libtorch
HunyuanDiT with TensorRT and libtorch
☆18 May 22, 2024 Updated 2 years ago
lovemefan / ggml-learning-notes
ggml learning notes; ggml is an inference framework for machine learning
☆18 Mar 24, 2024 Updated 2 years ago
pierrel55 / llama_st
Load and run Llama from safetensors files in C
☆15 Oct 24, 2024 Updated last year
AXERA-TECH / OWLVIT-ONNX-AX650-CPP
☆23 Jan 3, 2024 Updated 2 years ago
wangzhaode / onnx-llm
LLM deployment project based on ONNX.
☆49 Oct 9, 2024 Updated last year
mailliw2010 / infer-frame
An AI infra framework for edge devices, based on nndeploy
☆18 Nov 27, 2025 Updated 7 months ago
triple-mu / TensorRT2ONNX
A tool to convert a TensorRT engine/plan to a fake ONNX
☆41 Nov 22, 2022 Updated 3 years ago
EdVince / model_zoo
Recording models
☆12 Sep 19, 2023 Updated 2 years ago
AXERA-TECH / SAM-ONNX-AX650-CPP
☆18 Dec 7, 2023 Updated 2 years ago
ChunelFeng / CGraph-lite
A single-page, CGraph-API-like DAG project.
☆28 Feb 11, 2025 Updated last year
Infrasys-AI / aiinfra-docs
☆21 Nov 6, 2025 Updated 8 months ago
HydraQYH / hp_rms_norm
High-performance RMSNorm implementation using SM core storage (registers and shared memory)
☆30 Jan 22, 2026 Updated 6 months ago
wangzhaode / jinja.cpp
A lightweight, single-header C++11 Jinja2 template engine for LLM chat templates.
☆20 Mar 4, 2026 Updated 4 months ago
yuxiaoranyu / stable_diffusion_trt_triton
☆20 Dec 29, 2023 Updated 2 years ago
Dominic23331 / rtmpose_tensorrt
☆22 Apr 10, 2024 Updated 2 years ago
Peter-Chou / transformer_cpp_tokenizers
transformer tokenizers (e.g. BERT tokenizer) in C++ (WIP)
☆18 Apr 7, 2022 Updated 4 years ago
fabio-sim / DocShadow-ONNX-TensorRT
ONNX-compatible DocShadow: High-Resolution Document Shadow Removal. Supports TensorRT 🚀
☆25 Sep 13, 2023 Updated 2 years ago
AXERA-TECH / CLIP-ONNX-AX650-CPP
☆29 Jun 30, 2025 Updated last year
zhangjikai / parallel
Parallel computing study notes
☆44 Feb 25, 2017 Updated 9 years ago
QwenLM / qwen.cpp
C++ implementation of Qwen-LM
☆627 Dec 6, 2024 Updated last year
wangzhaode / mnn-stable-diffusion
Stable Diffusion using MNN
☆68 Sep 28, 2023 Updated 2 years ago
delta1037 / RknnInferTemplate
RKNN model inference deployment template
☆24 Aug 11, 2023 Updated 2 years ago
OpenPPL / ppl.llm.kernel.cuda
☆150 Jan 9, 2025 Updated last year
SNU-ARC / OpenDNN
OpenDNN: An Open-source, cuDNN-like Deep Learning Primitive Library
☆29 Dec 9, 2019 Updated 6 years ago
tanjatang / snpe_resnet
SNPE tutorial
☆10 Dec 25, 2023 Updated 2 years ago
mlc-ai / tokenizers-cpp
Universal cross-platform tokenizers binding to HF and sentencepiece
☆497 May 20, 2026 Updated 2 months ago
gigit0000 / qwen3.c
Lightweight C inference for Qwen3 GGUF. Multiturn prefix caching & batch processing.
☆25 Sep 1, 2025 Updated 10 months ago
caiwanxianhust / CUDA-BLOG
Stores some blog files related to CUDA programming.
☆22 Oct 16, 2025 Updated 9 months ago
wangzhaode / mnn-segment-anything
segment-anything based on MNN
☆37 Dec 13, 2023 Updated 2 years ago
vicalloy / docker-images
Dockerfiles for poetry/mlc-llm(rk3588)/...
☆10 Sep 13, 2023 Updated 2 years ago
yvonwin / qwen2.cpp
qwen2 and llama3 C++ implementation
☆50 Jun 7, 2024 Updated 2 years ago
PKU-SEC-Lab / Jetson-PI-Edge
☆16 Updated this week
wangzhaode / mnn-asr
MNN ASR demo.
☆27 Mar 24, 2025 Updated last year
daquexian / web-model-converter
☆42 Nov 29, 2022 Updated 3 years ago
modelscope / dash-infer
DashInfer is a native LLM inference engine aiming to deliver industry-leading performance atop various hardware architectures, including …
☆273 Aug 6, 2025 Updated 11 months ago
ZHEQIUSHUI / SAM-ONNX-AX650-CPP
SAM and lama inpaint, with a Qt GUI that supports interactive point and box prompts for SAM with results shown in real time, followed by inpainting. See the video in the README for the exact steps.
☆54 Jan 30, 2024 Updated 2 years ago
staghado / vit.cpp
Inference Vision Transformer (ViT) in plain C/C++ with ggml
☆318 Apr 11, 2024 Updated 2 years ago
weishengying / cutlass_flash_atten_fp8
FP8 flash attention implemented on the Ada architecture using the cutlass repository
☆82 Aug 12, 2024 Updated last year