triton-inference-server / pytorch_backend

The Triton backend for PyTorch TorchScript models.

☆160

Alternatives and similar repositories for pytorch_backend

Users who are interested in pytorch_backend are comparing it to the libraries listed below.

triton-inference-server / onnxruntime_backend
The Triton backend for the ONNX Runtime.
☆162 Updated 2 weeks ago
triton-inference-server / model_navigator
Triton Model Navigator is an inference toolkit designed for optimizing and deploying Deep Learning models with a focus on NVIDIA GPUs.
☆213 Updated 6 months ago
meta-pytorch / multipy
torch::deploy (multipy for non-torch uses) is a system that lets you get around the GIL problem by running multiple Python interpreters i…
☆180 Updated last month
triton-inference-server / tensorrt_backend
The Triton backend for TensorRT.
☆79 Updated 2 weeks ago
triton-inference-server / model_analyzer
Triton Model Analyzer is a CLI tool to help with better understanding of the compute and memory requirements of the Triton Inference Serv…
☆494 Updated 2 weeks ago
meta-pytorch / torchsnapshot
A performant, memory-efficient checkpointing library for PyTorch applications, designed with large, complex distributed workloads in mind…
☆161 Updated last month
pytorch / rfcs
PyTorch RFCs (experimental)
☆135 Updated 4 months ago
triton-inference-server / backend
Common source, scripts and utilities for creating Triton backends.
☆352 Updated 2 weeks ago
pytorch / torchdistx
Torch Distributed Experimental
☆117 Updated last year
anyscale / llm-continuous-batching-benchmarks
☆121 Updated last year
HabanaAI / Model-References
Reference models for Intel(R) Gaudi(R) AI Accelerator
☆165 Updated last month
meta-pytorch / torchx
TorchX is a universal job launcher for PyTorch applications. TorchX is designed to have fast iteration time for training/research and sup…
☆395 Updated last week
octoml / octoml-profile
Home for OctoML PyTorch Profiler
☆114 Updated 2 years ago
NVIDIA / LDDL
Distributed preprocessing and data loading for language datasets
☆39 Updated last year
lucidrains / triton-transformer
Implementation of a Transformer, but completely in Triton
☆276 Updated 3 years ago
gpuopenanalytics / pynvml
Provide Python access to the NVML library for GPU diagnostics
☆248 Updated last month
pytorch / ort
Accelerate PyTorch models with ONNX Runtime
☆365 Updated 8 months ago
triton-inference-server / vllm_backend
☆302 Updated last week
triton-inference-server / triton_cli
Triton CLI is an open source command line interface that enables users to create, deploy, and profile models served by the Triton Inferen…
☆70 Updated 2 weeks ago
deepspeedai / DeepSpeed-Kernels
☆72 Updated 6 months ago
facebookresearch / HolisticTraceAnalysis
A library to analyze PyTorch traces.
☆416 Updated last week
foundation-model-stack / foundation-model-stack
🚀 Collection of components for development, training, tuning, and inference of foundation models leveraging PyTorch native components.
☆215 Updated last week
neuralmagic / AutoFP8
☆205 Updated 5 months ago
NetEase-FuXi / EETQ
Easy and Efficient Quantization for Transformers
☆202 Updated 4 months ago
NVIDIA / Fuser
A Fusion Code Generator for NVIDIA GPUs (commonly known as "nvFuser")
☆357 Updated last week
microsoft / batch-inference
Dynamic batching library for Deep Learning inference. Tutorials for LLM, GPT scenarios.
☆102 Updated last year
hpcaitech / TensorNVMe
A Python library that transfers PyTorch tensors between CPU and NVMe
☆120 Updated 10 months ago
pytorch / test-infra
This repository hosts code that supports the testing infrastructure for the PyTorch organization. For example, this repo hosts the logic …
☆102 Updated this week
pytorch / tensorpipe
A tensor-aware point-to-point communication primitive for machine learning
☆273 Updated 2 months ago
meta-pytorch / applied-ai
Applied AI experiments and examples for PyTorch
☆299 Updated 2 months ago