mlcommons / mobile_models
MLPerf™ Mobile models
☆26Updated 9 months ago
Alternatives and similar repositories for mobile_models
Users that are interested in mobile_models are comparing it to the libraries listed below
- ☆21Updated 3 years ago
- Machine Intelligence Shader Autogen. AMDGPU ML shader code generator. (previously iGEMMgen)☆34Updated last week
- ☆69Updated 2 years ago
- portDNN is a library implementing neural network algorithms written using SYCL☆113Updated last year
- GEMM and Winograd based convolutions using CUTLASS☆26Updated 5 years ago
- TVM stack: exploring the incredible explosion of deep-learning frameworks and how to bring them together☆64Updated 7 years ago
- An experimental CPU backend for Triton (https://github.com/openai/triton)☆43Updated 4 months ago
- THIS REPOSITORY HAS MOVED TO github.com/nvidia/cub, WHICH IS AUTOMATICALLY MIRRORED HERE.☆84Updated last year
- A lightweight, Pythonic, frontend for MLIR☆80Updated last year
- ☆22Updated last year
- CNNs in Halide☆23Updated 9 years ago
- Yet another Polyhedra Compiler for DeepLearning☆19Updated 2 years ago
- ☆11Updated 4 years ago
- A profiler to disclose and quantify hardware features on GPUs.☆172Updated 3 years ago
- Sandbox for TVM and playing around!☆22Updated 2 years ago
- Unified compiler/runtime for interfacing with PyTorch Dynamo.☆100Updated last week
- cuASR: CUDA Algebra for Semirings☆36Updated 2 years ago
- A Data-Centric Compiler for Machine Learning☆84Updated last year
- AMD's graph optimization engine.☆230Updated this week
- A Winograd Minimal Filter Implementation in CUDA☆25Updated 3 years ago
- Open source cross-platform compiler for compute-intensive loops used in AI algorithms, from Microsoft Research☆109Updated last year
- ☆61Updated this week
- ONNX Command-Line Toolbox☆35Updated 9 months ago
- Library for fast image convolution in neural networks on Intel Architecture☆30Updated 8 years ago
- MatMul Performance Benchmarks for a Single CPU Core comparing both hand engineered and codegen kernels.☆133Updated last year
- A simple profiler to count Nvidia PTX assembly instructions of OpenCL/SYCL/CUDA kernels for roofline model analysis.☆55Updated 3 months ago
- AMD ROCm Performance Primitives (RPP) library is a comprehensive high-performance computer vision library for AMD processors with HIP/Ope…☆65Updated this week
- CUDA Template Functions☆19Updated 7 months ago
- FP64 equivalent GEMM via Int8 Tensor Cores using the Ozaki scheme☆77Updated 3 months ago
- A demo of how to write a high-performance convolution that runs on Apple silicon☆54Updated 3 years ago