microsoft / onnxruntime-extensions
onnxruntime-extensions: A specialized pre- and post-processing library for ONNX Runtime
☆375 · Updated this week
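For context, a minimal sketch of how onnxruntime-extensions is typically wired into an ONNX Runtime session: the extensions library exposes its custom pre/post-processing operators as a shared library that the session registers before loading a model. The model filename below is a placeholder for any model that uses these custom ops.

```python
# Minimal sketch: register the onnxruntime-extensions custom-op library with an
# ONNX Runtime session so models using its pre/post-processing ops can load.
import onnxruntime as ort
from onnxruntime_extensions import get_library_path

so = ort.SessionOptions()
# Point the session at the shared library shipped with onnxruntime-extensions.
so.register_custom_ops_library(get_library_path())

# "model_with_custom_ops.onnx" is a hypothetical model path.
sess = ort.InferenceSession("model_with_custom_ops.onnx", sess_options=so)
```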
Alternatives and similar repositories for onnxruntime-extensions:
Users interested in onnxruntime-extensions are comparing it to the libraries listed below.
- ONNX Script enables developers to naturally author ONNX functions and models using a subset of Python. (☆341, updated this week)
- Common utilities for ONNX converters (☆266, updated 4 months ago)
- ONNX Optimizer (☆696, updated 3 weeks ago)
- Examples for using ONNX Runtime for model training. (☆332, updated 6 months ago)
- The Triton backend for the ONNX Runtime. (☆140, updated last week)
- A parser, editor and profiler tool for ONNX models. (☆425, updated 3 months ago)
- Triton Model Analyzer is a CLI tool to help with better understanding of the compute and memory requirements of Triton Inference Server models. (☆472, updated this week)
- Representation and Reference Lowering of ONNX Models in MLIR Compiler Infrastructure (☆843, updated this week)
- Triton Model Navigator is an inference toolkit designed for optimizing and deploying Deep Learning models with a focus on NVIDIA GPUs. (☆199, updated 3 months ago)
- Triton Python, C++ and Java client libraries, and GRPC-generated client examples for Go, Java and Scala. (☆619, updated this week)
- Common source, scripts and utilities for creating Triton backends. (☆316, updated last week)
- Accelerate PyTorch models with ONNX Runtime (☆359, updated 2 months ago)
- Transform ONNX models to a PyTorch representation (☆332, updated 5 months ago)
- A Python-level JIT compiler designed to make unmodified PyTorch programs faster. (☆1,040, updated last year)
- Generative AI extensions for onnxruntime (☆693, updated this week)
- Examples for using ONNX Runtime for machine learning inferencing. (☆1,354, updated last week)
- LLaMa/RWKV ONNX models, quantization and test cases (☆361, updated last year)
- The Triton backend that allows running GPU-accelerated data pre-processing pipelines implemented in DALI's Python API. (☆132, updated this week)
- Universal cross-platform tokenizers binding to HF and sentencepiece (☆323, updated last week)
- PyTriton is a Flask/FastAPI-like interface that simplifies Triton's deployment in Python environments. (☆786, updated 2 months ago)
- TensorRT Plugin Autogen Tool (☆370, updated 2 years ago)
- A tool to modify ONNX models visually, based on Netron and Flask. (☆1,473, updated 2 months ago)
- A toolkit to help optimize ONNX models (☆140, updated this week)
- Triton backend that enables pre-processing, post-processing and other logic to be implemented in Python. (☆603, updated last week)
- Model compression for ONNX (☆91, updated 5 months ago)
- nvidia-modelopt is a unified library of state-of-the-art model optimization techniques like quantization, pruning, distillation and speculative decoding. (☆870, updated last week)
- torch::deploy (multipy for non-torch uses) is a system that lets you get around the GIL problem by running multiple Python interpreters in a single process. (☆180, updated 4 months ago)
- The Triton backend for TensorRT. (☆73, updated this week)
- Convert ONNX models to PyTorch. (☆664, updated 8 months ago)
- The Triton backend for PyTorch TorchScript models. (☆146, updated this week)