sammysun0711 / ov_llm_bench
OpenVINO LLM Benchmark
☆11 Updated last year
Alternatives and similar repositories for ov_llm_bench:
Users who are interested in ov_llm_bench are comparing it to the libraries listed below:
- OpenVINO Tokenizers extension ☆31 Updated this week
- CUDA Templates for Linear Algebra Subroutines ☆16 Updated this week
- Intel® Extension for DeepSpeed* is an extension to DeepSpeed that brings feature support with SYCL kernels on Intel GPU (XPU) device. Note… ☆62 Updated 3 weeks ago
- Run Generative AI models with simple C++/Python API and using OpenVINO Runtime ☆249 Updated this week
- oneCCL Bindings for Pytorch* ☆91 Updated this week
- ☆61 Updated 3 months ago
- ☆26 Updated this week
- Provides the examples to write and build Habana custom kernels using the HabanaTools ☆21 Updated 4 months ago
- ☆20 Updated last week
- ☆20 Updated 2 weeks ago
- ☆46 Updated last week
- DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective. ☆13 Updated last month
- Intel® Tensor Processing Primitives extension for Pytorch* ☆12 Updated this week
- ☆37 Updated this week
- High-Performance Linpack Benchmark adopted version for GPU backend ☆11 Updated 2 years ago
- OpenAI Triton backend for Intel® GPUs ☆172 Updated this week
- ☆58 Updated 4 months ago
- Ahead of Time (AOT) Triton Math Library ☆56 Updated 2 weeks ago
- PArallelLOOPgEneratoR: Threaded Loops Code Generation Infrastructure targeting Tensor Contraction Applications such as GEMMs, Convolution… ☆18 Updated 3 months ago
- A PyTorch Extension: Tools for easy mixed precision and distributed training in Pytorch ☆21 Updated this week
- Reference models for Intel(R) Gaudi(R) AI Accelerator ☆162 Updated last month
- oneAPI Collective Communications Library (oneCCL) ☆227 Updated this week
- Computation using data flow graphs for scalable machine learning ☆67 Updated this week
- ☆16 Updated this week
- Fast and efficient attention method exploration and implementation. ☆19 Updated last week
- ☆23 Updated 2 months ago
- pytorch code examples for measuring the performance of collective communication calls in AI workloads ☆16 Updated 5 months ago
- Test suite for probing the numerical behavior of NVIDIA tensor cores ☆37 Updated 8 months ago
- ☆410 Updated last week
- OpenVINO backend for Triton. ☆31 Updated 3 weeks ago