mlcommons / mlperf_client
MLPerf Client is a benchmark for Windows, Linux and macOS, focusing on client form factors in ML inference scenarios.
☆73 · Updated 2 months ago
Alternatives and similar repositories for mlperf_client
Users interested in mlperf_client are comparing it to the libraries listed below.
- No-code CLI designed for accelerating ONNX workflows ☆227 · Updated 7 months ago
- Transformer GPU VRAM estimator (see the sizing sketch after this list) ☆68 · Updated last year
- AMD-related optimizations for transformer models ☆97 · Updated 3 months ago
- vLLM: A high-throughput and memory-efficient inference and serving engine for LLMs (see the usage sketch after this list) ☆93 · Updated this week
- Repository of model demos using TT-Buda ☆63 · Updated 10 months ago
- Train, tune, and run inference with the Bamba model ☆137 · Updated 8 months ago
- Machine Learning Agility (MLAgility) benchmark and benchmarking tools ☆40 · Updated 6 months ago
- LLM training in simple, raw C/HIP for AMD GPUs ☆58 · Updated last year
- Self-host LLMs with vLLM and BentoML ☆168 · Updated 2 weeks ago
- Intel® AI Super Builder ☆156 · Updated 2 weeks ago
- LLM inference in C/C++ ☆104 · Updated last week
- A collection of available inference solutions for LLMs ☆94 · Updated 11 months ago
- A minimalistic C++ Jinja templating engine for LLM chat templates ☆203 · Updated 4 months ago
- Run generative AI models with a simple C++/Python API using the OpenVINO Runtime (see the pipeline sketch after this list) ☆428 · Updated this week
- Python package of rocm-smi-lib ☆24 · Updated last month
- ScalarLM - a unified training and inference stack ☆97 · Updated 2 months ago
- [DEPRECATED] Moved to the ROCm/rocm-libraries repo ☆113 · Updated this week
- Write a fast kernel and run it on Discord. See how you compare against the best! ☆68 · Updated last week
- An innovative library for efficient LLM inference via low-bit quantization ☆352 · Updated last year
- vLLM adapter for a TGIS-compatible gRPC server. ☆50 · Updated this week
- Onboarding documentation source for the AMD Ryzen™ AI Software Platform. The AMD Ryzen™ AI Software Platform enables developers to take… ☆92 · Updated this week
- Inference server benchmarking tool ☆142 · Updated 4 months ago
- Benchmark and optimize LLM inference across frameworks with ease ☆161 · Updated 4 months ago
- High-Performance FP32 GEMM on CUDA devices ☆117 · Updated last year
- ☆137 · Updated last week
- ☆95 · Updated this week
- llama.cpp to PyTorch Converter ☆37 · Updated last year
- Building blocks for agents in C++ ☆139 · Updated 2 weeks ago
- IBM development fork of https://github.com/huggingface/text-generation-inference ☆63 · Updated 4 months ago
- A command-line interface tool for serving LLMs using vLLM. ☆468 · Updated 2 weeks ago
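For context on the Transformer GPU VRAM estimator entry above, here is a minimal back-of-the-envelope sizing sketch in Python. The weights-plus-KV-cache formula is the standard rough estimate, not necessarily that repository's exact method, and all parameter names below are illustrative.

```python
# Rough VRAM estimate for transformer inference.
# Illustrative only; not the linked estimator's exact formula.

def estimate_vram_gb(
    n_params_b: float,       # parameters, in billions
    bytes_per_param: float,  # 2 for fp16/bf16, 1 for int8, 0.5 for 4-bit
    n_layers: int,
    n_kv_heads: int,
    head_dim: int,
    context_len: int,
    batch_size: int = 1,
    kv_bytes: float = 2.0,   # fp16 KV cache
) -> float:
    weights = n_params_b * 1e9 * bytes_per_param
    # KV cache: 2 tensors (K and V) per layer, per KV head, per position.
    kv_cache = (2 * n_layers * n_kv_heads * head_dim
                * context_len * batch_size * kv_bytes)
    overhead = 0.10 * weights  # ~10% for activations/runtime buffers (rough)
    return (weights + kv_cache + overhead) / 1e9

# Example: an 8B-parameter model at fp16 with 8k context
# (32 layers, 8 KV heads, head dim 128) -> roughly 18-19 GB.
print(f"{estimate_vram_gb(8, 2, 32, 8, 128, 8192):.1f} GB")
```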
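For the vLLM entry above, a minimal offline-generation sketch using vLLM's public Python API (`LLM` and `SamplingParams`); the model name is only a placeholder.

```python
from vllm import LLM, SamplingParams

# Load any Hugging Face-compatible model; this small model is just an example.
llm = LLM(model="facebook/opt-125m")
params = SamplingParams(temperature=0.8, max_tokens=64)

# generate() takes a list of prompts and returns one RequestOutput per prompt.
outputs = llm.generate(["What is MLPerf Client?"], params)
for out in outputs:
    print(out.outputs[0].text)
```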
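For the OpenVINO GenAI entry above, a short pipeline sketch of its documented Python API (`LLMPipeline`). The model directory is a placeholder and assumes a model already exported to OpenVINO IR (e.g. via `optimum-cli`).

```python
import openvino_genai as ov_genai

# Placeholder path: a model previously converted to OpenVINO IR format.
pipe = ov_genai.LLMPipeline("TinyLlama-1.1B-Chat-ov", "CPU")

# generate() accepts generation parameters as keyword arguments.
print(pipe.generate("What is MLPerf Client?", max_new_tokens=64))
```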