sammysun0711 / ov_llm_bench
OpenVINO LLM Benchmark
☆11 Updated 11 months ago
Related projects
Alternatives and complementary repositories for ov_llm_bench
- Intel® Tensor Processing Primitives extension for Pytorch* ☆10 Updated this week
- DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective. ☆11 Updated last month
- ☆16 Updated this week
- OpenVINO Tokenizers extension ☆24 Updated this week
- ☆39 Updated last month
- Intel® Extension for DeepSpeed* is an extension to DeepSpeed that brings feature support with SYCL kernels on Intel GPU(XPU) device. Note… ☆57 Updated 2 months ago
- ☆15 Updated 2 months ago
- Intel Gaudi's Megatron DeepSpeed Large Language Models for training ☆13 Updated 3 weeks ago
- ☆12 Updated this week
- oneCCL Bindings for Pytorch* ☆86 Updated 2 weeks ago
- OpenAI Triton backend for Intel® GPUs ☆143 Updated this week
- Run Generative AI models with simple C++/Python API and using OpenVINO Runtime ☆145 Updated this week
- ☆29 Updated this week
- OpenVINO backend for Triton. ☆29 Updated this week
- Provides the examples to write and build Habana custom kernels using the HabanaTools ☆18 Updated this week
- ☆59 Updated this week
- Ahead of Time (AOT) Triton Math Library ☆40 Updated this week
- Development repository for the Triton language and compiler ☆93 Updated this week
- ☆10 Updated 3 months ago
- BERT for Distributed PyTorch + AMP Training ☆12 Updated last year
- AITemplate is a Python framework which renders neural network into high performance CUDA/HIP C++ code. Specialized for FP16 TensorCore (N… ☆11 Updated 4 months ago
- Test suite for probing the numerical behavior of NVIDIA tensor cores ☆30 Updated 3 months ago
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆44 Updated this week
- A PyTorch Extension: Tools for easy mixed precision and distributed training in Pytorch ☆19 Updated last week
- ☆93 Updated last month
- MLPerf™ logging library ☆30 Updated last week
- oneAPI Technical Advisory Board (TAB) Meeting Notes ☆72 Updated 9 months ago
- PArallelLOOPgEneratoR: Threaded Loops Code Generation Infrastructure targeting Tensor Contraction Applications such as GEMMs, Convolution… ☆18 Updated last month
- ☆57 Updated this week
- oneAPI Level Zero Conformance & Performance test content ☆46 Updated this week