openvinotoolkit / mlas
☆10 Updated 2 months ago
Alternatives and similar repositories for mlas:
Users who are interested in mlas are comparing it to the libraries listed below.
- ONNX Script editor & visualiser running completely in the browser thanks to Pyodide and Netron ☆20 Updated 2 years ago
- ☆65 Updated last week
- Inference TinyLlama models on ncnn ☆24 Updated last year
- MozoLM: A language model (LM) serving library ☆44 Updated last month
- Loop Nest - Linear algebra compiler and code generator. ☆22 Updated 2 years ago
- the C++ version of Seq2Seq with ncnn ☆23 Updated 3 years ago
- Open deep learning compiler stack for cpu, gpu and specialized accelerators ☆34 Updated 2 years ago
- a single-header math library ☆16 Updated 6 months ago
- ☆124 Updated last year
- AMD ROCm Performance Primitives (RPP) library is a comprehensive high-performance computer vision library for AMD processors with HIP/Ope… ☆59 Updated this week
- An easy way to run, test, benchmark and tune OpenCL kernel files ☆23 Updated last year
- Unit Scaling demo and experimentation code ☆16 Updated last year
- FlexAttention w/ FlashAttention3 Support ☆26 Updated 6 months ago
- This repository is a read-only mirror of https://gitlab.arm.com/kleidi/kleidiai ☆27 Updated this week
- Open deep learning compiler stack for cpu, gpu and specialized accelerators ☆18 Updated this week
- OpenVINO Tokenizers extension ☆31 Updated this week
- Inference RWKV v5, v6 and v7 with Qualcomm AI Engine Direct SDK ☆62 Updated last week
- benchmarking some transformer deployments ☆26 Updated 2 years ago
- int8_t and int16_t matrix multiply based on https://arxiv.org/abs/1705.01991 ☆69 Updated last year
- A tracing JIT for PyTorch ☆17 Updated 2 years ago
- Snapdragon Neural Processing Engine (SNPE) SDK. The Snapdragon Neural Processing Engine (SNPE) is a Qualcomm Snapdragon software accelerate… ☆34 Updated 2 years ago
- FP64 equivalent GEMM via Int8 Tensor Cores using the Ozaki scheme ☆57 Updated 2 weeks ago
- Acoustic Neighbor Embeddings ☆21 Updated 3 months ago
- Experiments with BitNet inference on CPU ☆53 Updated last year
- An experimental CPU backend for Triton (https://github.com/openai/triton) ☆40 Updated 2 weeks ago
- ☆40 Updated 2 years ago
- Standalone commandline CLI tool for compiling Triton kernels ☆17 Updated 6 months ago
- Whisper in TensorRT-LLM ☆15 Updated last year
- XLA integration of Open Neural Network Exchange (ONNX) ☆19 Updated 6 years ago
- Estimating hardware and cloud costs of LLMs and transformer projects ☆14 Updated last year