deepjavalibrary / djl-serving
A universal, scalable machine learning model deployment solution
☆213 · Updated this week
Alternatives and similar repositories for djl-serving:
Users interested in djl-serving are comparing it to the libraries listed below.
- ☆104 · Updated 2 months ago
- Easy, fast and very cheap training and inference on AWS Trainium and Inferentia chips. ☆222 · Updated this week
- Demo applications showcasing DJL ☆328 · Updated last week
- ☆238 · Updated last week
- Triton Model Analyzer is a CLI tool to help with better understanding of the compute and memory requirements of the Triton Inference Serv… ☆465 · Updated 2 weeks ago
- Example code for AWS Neuron SDK developers building inference and training applications ☆140 · Updated last month
- Triton Python, C++ and Java client libraries, and GRPC-generated client examples for Go, Java and Scala. ☆613 · Updated last week
- Common source, scripts and utilities for creating Triton backends. ☆311 · Updated last week
- PyTriton is a Flask/FastAPI-like interface that simplifies Triton's deployment in Python environments. ☆784 · Updated last month
- This repository contains tutorials and examples for Triton Inference Server. ☆673 · Updated last week
- LLMPerf is a library for validating and benchmarking LLMs. ☆845 · Updated 3 months ago
- Hands-on workshop for distributed training and hosting on SageMaker ☆133 · Updated last month
- Large Language Model Hosting Container ☆85 · Updated 2 weeks ago
- The Triton TensorRT-LLM Backend ☆812 · Updated this week
- Triton backend that enables pre-processing, post-processing and other logic to be implemented in Python. ☆597 · Updated this week
- The Triton backend for the ONNX Runtime. ☆140 · Updated 2 weeks ago
- ☆52 · Updated last month
- The Java implementation of Dive into Deep Learning (D2L.ai) ☆181 · Updated last week
- 🏋️ A unified multi-backend utility for benchmarking Transformers, Timm, PEFT, Diffusers and Sentence-Transformers with full support of O… ☆289 · Updated 2 months ago
- Triton CLI is an open source command line interface that enables users to create, deploy, and profile models served by the Triton Inferen… ☆61 · Updated last week
- Fast Inference Solutions for BLOOM ☆562 · Updated 5 months ago
- Triton Model Navigator is an inference toolkit designed for optimizing and deploying Deep Learning models with a focus on NVIDIA GPUs. ☆199 · Updated 2 months ago
- ☆252 · Updated this week
- Foundation Model Evaluations Library ☆239 · Updated 3 weeks ago
- Serve machine learning models within a 🐳 Docker container using 🧠 Amazon SageMaker. ☆402 · Updated last year
- Powering AWS purpose-built machine learning chips. Blazing fast and cost effective, natively integrated into PyTorch and TensorFlow and i… ☆501 · Updated this week
- Examples on how to use LangChain and Ray ☆226 · Updated last year
- Use two different methods (DeepSpeed and the SageMaker model parallelism library) to fine-tune a Llama model on SageMaker, then deploy the … ☆23 · Updated last year
- A TensorFlow Serving solution for use in SageMaker. This repo is now deprecated. ☆173 · Updated last year
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆12 · Updated this week