mlcommons / mlperf_client
MLPerf Client is a benchmark for Windows, Linux and macOS, focusing on client form factors in ML inference scenarios.
☆67 · Updated last month
Alternatives and similar repositories for mlperf_client
Users interested in mlperf_client are comparing it to the repositories listed below.
- No-code CLI designed for accelerating ONNX workflows ☆221 · Updated 7 months ago
- Transformer GPU VRAM estimator ☆67 · Updated last year
- AMD-related optimizations for transformer models ☆96 · Updated 2 months ago
- vLLM: A high-throughput and memory-efficient inference and serving engine for LLMs ☆94 · Updated last week
- Intel® AI Assistant Builder ☆140 · Updated this week
- Train, tune, and infer the Bamba model ☆137 · Updated 7 months ago
- A collection of all available inference solutions for LLMs ☆94 · Updated 10 months ago
- A minimalistic C++ Jinja templating engine for LLM chat templates ☆202 · Updated 3 months ago
- AI Tensor Engine for ROCm ☆330 · Updated this week
- An innovative library for efficient LLM inference via low-bit quantization ☆351 · Updated last year
- Benchmark and optimize LLM inference across frameworks with ease ☆153 · Updated 3 months ago
- LLM training in simple, raw C/HIP for AMD GPUs ☆57 · Updated last year
- ☆114 · Updated last week
- Machine Learning Agility (MLAgility) benchmark and benchmarking tools ☆40 · Updated 5 months ago
- Inference server benchmarking tool ☆135 · Updated 3 months ago
- ☆219 · Updated 11 months ago
- [DEPRECATED] Moved to the ROCm/rocm-libraries repo ☆113 · Updated last week
- CUDA-L2: Surpassing cuBLAS Performance for Matrix Multiplication through Reinforcement Learning ☆294 · Updated this week
- LLM inference in C/C++ ☆104 · Updated 3 weeks ago
- Repository of model demos using TT-Buda ☆63 · Updated 9 months ago
- Efficient non-uniform quantization with GPTQ for GGUF ☆58 · Updated 3 months ago
- ScalarLM: a unified training and inference stack ☆94 · Updated last month
- Fully Open Language Models with Stellar Performance ☆312 · Updated last month
- Documentation repository for SGLang, auto-generated from https://github.com/sgl-project/sglang ☆96 · Updated this week
- Write a fast kernel and run it on Discord. See how you compare against the best! ☆66 · Updated 3 weeks ago
- Self-host LLMs with vLLM and BentoML ☆163 · Updated last month
- High-performance SGEMM on CUDA devices ☆115 · Updated 11 months ago
- ☆690 · Updated last week
- ☆83 · Updated last month
- Code for the paper "Analog Foundation Models" ☆27 · Updated 3 months ago
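Several of the repositories above revolve around sizing and benchmarking LLM inference (the transformer GPU VRAM estimator, the inference benchmarking tools). As a minimal sketch of the usual back-of-envelope VRAM estimate (model weights plus KV cache), here is a hypothetical `estimate_vram_gb` helper; the function name and all default parameters are illustrative assumptions, not taken from any repository listed.

```python
def estimate_vram_gb(n_params_b: float, dtype_bytes: int = 2,
                     n_layers: int = 32, n_kv_heads: int = 8,
                     head_dim: int = 128, seq_len: int = 4096,
                     batch: int = 1) -> float:
    """Rough inference VRAM estimate: weights + KV cache, in decimal GB.

    All defaults are illustrative (roughly a 7B-class model with
    grouped-query attention); real estimators also account for
    activations and framework overhead.
    """
    weights = n_params_b * 1e9 * dtype_bytes  # parameter storage
    # KV cache: 2 tensors (K and V) per layer, per KV head, per position
    kv_cache = 2 * n_layers * n_kv_heads * head_dim * seq_len * batch * dtype_bytes
    return (weights + kv_cache) / 1e9

# Example: a 7B model in fp16 with the defaults above
print(round(estimate_vram_gb(7.0), 1))  # → 14.5
```

Note that this deliberately ignores activation memory and runtime overhead, which is why serving engines such as vLLM reserve headroom beyond the raw weights-plus-cache figure.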