deepjavalibrary / djl-serving
A universal, scalable machine learning model deployment solution
☆192 · Updated this week
Related projects:
- Easy, fast and very cheap training and inference on AWS Trainium and Inferentia chips. ☆193 · Updated this week
- The Triton TensorRT-LLM Backend ☆654 · Updated this week
- Examples on how to use LangChain and Ray ☆217 · Updated last year
- Pretrain, finetune and serve LLMs on Intel platforms with Ray ☆95 · Updated this week
- Large Language Model Hosting Container ☆75 · Updated 2 weeks ago
- LLMPerf is a library for validating and benchmarking LLMs ☆578 · Updated 3 weeks ago
- Example code for AWS Neuron SDK developers building inference and training applications ☆120 · Updated 2 weeks ago
- This repository contains tutorials and examples for Triton Inference Server ☆527 · Updated this week
- Triton backend that enables pre-processing, post-processing, and other logic to be implemented in Python. ☆523 · Updated this week
- 🏋️ A unified multi-backend utility for benchmarking Transformers, Timm, PEFT, Diffusers and Sentence-Transformers with full support of O… ☆231 · Updated last week
- Scalable data pre-processing and curation toolkit for LLMs ☆461 · Updated this week
- NIM Agent Blueprint for multimodal PDF extraction ☆29 · Updated 2 weeks ago
- Foundation Model Evaluations Library ☆184 · Updated 3 weeks ago
- PyTriton is a Flask/FastAPI-like interface that simplifies Triton's deployment in Python environments. ☆715 · Updated last month
- Collection of best practices, reference architectures, model training examples and utilities to train large models on AWS. ☆177 · Updated this week
- Triton Python, C++ and Java client libraries, and GRPC-generated client examples for go, java and scala. ☆547 · Updated this week
- A high-performance inference system for large language models, designed for production environments. ☆370 · Updated last week
- Triton Model Analyzer is a CLI tool for understanding the compute and memory requirements of the Triton Inference Serv… ☆419 · Updated last week
- Common source, scripts and utilities for creating Triton backends. ☆280 · Updated last week
- The Triton backend for the ONNX Runtime. ☆122 · Updated this week
- Hands-on workshop for distributed training and hosting on SageMaker ☆118 · Updated this week
- Toolkit for allowing inference and serving with PyTorch on SageMaker. Dockerfiles used for building SageMaker Pytorch Containers are at h… ☆134 · Updated 3 months ago
- Triton Model Navigator is an inference toolkit designed for optimizing and deploying Deep Learning models with a focus on NVIDIA GPUs. ☆178 · Updated last week
- Experiments with inference on llama ☆106 · Updated 3 months ago
- Foundation model benchmarking tool. Run any model on any AWS platform and benchmark for performance across instance type and serving stac… ☆175 · Updated this week