apache / tvm
Open deep learning compiler stack for CPU, GPU, and specialized accelerators
☆12,028 · Updated this week
Alternatives and similar repositories for tvm:
Users that are interested in tvm are comparing it to the libraries listed below
- Open standard for machine learning interoperability ☆18,448 · Updated this week
- oneAPI Deep Neural Network Library (oneDNN) ☆3,726 · Updated this week
- Compiler for Neural Network hardware accelerators ☆3,273 · Updated 9 months ago
- Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet. ☆14,389 · Updated 2 weeks ago
- Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Juli… ☆20,788 · Updated last year
- a language for fast, portable data-parallel computation ☆5,970 · Updated this week
- Development repository for the Triton language and compiler ☆14,452 · Updated this week
- Optimized primitives for collective multi-GPU communication ☆3,463 · Updated 3 weeks ago
- Tutorials for creating and using ONNX models ☆3,451 · Updated 7 months ago
- A machine learning compiler for GPUs, CPUs, and ML accelerators ☆2,963 · Updated this week
- Low-precision matrix multiplication ☆1,792 · Updated last year
- NumPy & SciPy for GPU ☆9,908 · Updated this week
- MMdnn is a set of tools to help users inter-operate among different deep learning frameworks. E.g. model conversion and visualization. Co… ☆5,805 · Updated 8 months ago
- A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used … ☆16,963 · Updated this week
- ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator ☆15,648 · Updated this week
- ☆1,656 · Updated 6 years ago
- Tutorial code on how to build your own Deep Learning System in 2k Lines ☆2,008 · Updated 6 years ago
- AutoML library for deep learning ☆9,197 · Updated 2 months ago
- High-efficiency floating-point neural network inference operators for mobile, server, and Web ☆1,960 · Updated this week
- "Multi-Level Intermediate Representation" Compiler Infrastructure ☆1,737 · Updated 3 years ago
- A PyTorch Extension: Tools for easy mixed precision and distributed training in PyTorch ☆8,537 · Updated 2 weeks ago
- A retargetable MLIR-based machine learning compiler and runtime toolkit. ☆2,996 · Updated this week
- A high performance and generic framework for distributed DNN training ☆3,662 · Updated last year
- A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep lear… ☆5,278 · Updated this week
- CUDA Templates for Linear Algebra Subroutines ☆6,233 · Updated last week
- Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more ☆31,324 · Updated this week
- Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on sing… ☆26,601 · Updated this week
- Efficiently computes derivatives of NumPy code. ☆7,138 · Updated this week
- TensorFlow's Visualization Toolkit ☆6,788 · Updated last week
- NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source compone… ☆11,184 · Updated 2 weeks ago