mlcommons / mlperf_client
MLPerf Client is a benchmark for Windows and macOS, focusing on client form factors in ML inference scenarios.
☆51 · Updated last week
Alternatives and similar repositories for mlperf_client
Users interested in mlperf_client are comparing it to the repositories listed below.
- No-code CLI designed for accelerating ONNX workflows · ☆214 · Updated 4 months ago
- LLM inference in C/C++ · ☆102 · Updated last month
- llama.cpp fork used by GPT4All · ☆57 · Updated 8 months ago
- ☆102 · Updated last year
- The NVIDIA RTX™ AI Toolkit is a suite of tools and SDKs for Windows developers to customize, optimize, and deploy AI models across RTX PC… · ☆176 · Updated 11 months ago
- Intel® AI Assistant Builder · ☆111 · Updated this week
- ScalarLM - a unified training and inference stack · ☆85 · Updated 2 weeks ago
- GPT-4 Level Conversational QA Trained In a Few Hours · ☆65 · Updated last year
- ☆413 · Updated last week
- Run generative AI models with a simple C++/Python API using the OpenVINO Runtime · ☆359 · Updated this week
- Self-host LLMs with vLLM and BentoML · ☆151 · Updated last week
- AMD-related optimizations for transformer models · ☆90 · Updated last month
- A collection of all available inference solutions for LLMs · ☆91 · Updated 7 months ago
- Train, tune, and run inference with the Bamba model · ☆134 · Updated 4 months ago
- Transformer GPU VRAM estimator (see the arithmetic sketch after this list) · ☆66 · Updated last year
- An innovative library for efficient LLM inference via low-bit quantization · ☆349 · Updated last year
- ☆264 · Updated 3 months ago
- This is the documentation repository for SGLang. It is auto-generated from https://github.com/sgl-project/sglang/tree/main/docs. · ☆80 · Updated last week
- This reference can be used with any existing OpenAI-integrated apps to run with TRT-LLM inference locally on a GeForce GPU on Windows inste… · ☆125 · Updated last year
- Benchmarking the serving capabilities of vLLM · ☆53 · Updated last year
- Route LLM requests to the best model for the task at hand. · ☆109 · Updated 3 weeks ago
- vLLM: A high-throughput and memory-efficient inference and serving engine for LLMs · ☆90 · Updated last week
- Benchmark and optimize LLM inference across frameworks with ease · ☆121 · Updated last month
- LLM training in simple, raw C/HIP for AMD GPUs · ☆51 · Updated last year
- ☆60 · Updated 4 months ago
- LLM inference on consumer devices · ☆124 · Updated 7 months ago
- A command-line interface tool for serving LLMs using vLLM. · ☆427 · Updated last month
- Simple examples using Argilla tools to build AI · ☆56 · Updated 11 months ago
- Experiments with BitNet inference on CPU · ☆54 · Updated last year
- Experimental Code for StructuredRAG: JSON Response Formatting with Large Language Models · ☆111 · Updated 6 months ago
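
For the "Transformer GPU VRAM estimator" entry above, the arithmetic such a tool performs is roughly: weight memory is parameter count times bytes per parameter, plus a KV-cache term that grows with context length and batch size. Below is a minimal sketch of that formula, not any listed repository's actual code; the function and parameter names (estimate_vram_gb, n_params, n_kv_heads, etc.) are hypothetical, and the byte widths are the standard fp16/int8 sizes.

```python
# Back-of-the-envelope VRAM estimate for transformer inference.
# Hypothetical sketch; real estimators also model activation buffers
# and framework overhead, so treat the result as a lower bound.

def estimate_vram_gb(
    n_params: float,          # total weight count, e.g. 7e9 for a "7B" model
    n_layers: int,            # decoder layers
    n_kv_heads: int,          # KV heads (== attention heads without GQA/MQA)
    head_dim: int,            # per-head dimension
    seq_len: int,             # context length in tokens
    batch: int = 1,
    weight_bytes: float = 2,  # 2 = fp16/bf16, 1 = int8, 0.5 = 4-bit
    kv_bytes: float = 2,      # KV cache precision
) -> float:
    weights = n_params * weight_bytes
    # KV cache: 2 tensors (K and V) per layer, per token, per KV head.
    kv_cache = 2 * n_layers * n_kv_heads * head_dim * seq_len * batch * kv_bytes
    return (weights + kv_cache) / 1024**3

# Example: a Llama-2-7B-like shape at fp16 with a 4k context.
print(estimate_vram_gb(7e9, n_layers=32, n_kv_heads=32,
                       head_dim=128, seq_len=4096))  # ≈ 15 GiB
```

With these numbers the weights alone take about 13 GiB (7e9 × 2 bytes) and the KV cache adds roughly 2 GiB at 4096 tokens, which is why a 16 GB card is commonly cited as the floor for unquantized 7B inference.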