nod-ai / SRTLinks

Nod.ai 🦈 version of 👻 . You probably want to start at https://github.com/nod-ai/shark for the product and the upstream IREE repository for mainline development. This repository houses branches and configuration that aren't ready for commit upstream.

☆106

Alternatives and similar repositories for SRT

Users that are interested in SRT are comparing it to the libraries listed below

Sorting:

nod-ai / transformer-benchmarks
benchmarking some transformer deployments
☆26Updated 2 years ago
nunoplopes / torchy
A tracing JIT compiler for PyTorch
☆13Updated 3 years ago
graphcore / poptorch
PyTorch interface for the IPU
☆181Updated 2 years ago
DeMoriarty / custom_matmul_kernels
Customized matrix multiplication kernels
☆57Updated 3 years ago
facebookresearch / loop_tool
A thin, highly portable toolkit for efficiently compiling dense loop-based computation.
☆148Updated 2 years ago
iree-org / iree-jax
☆52Updated last year
meta-pytorch / multipy
torch::deploy (multipy for non-torch uses) is a system that lets you get around the GIL problem by running multiple Python interpreters i…
☆181Updated 2 months ago
octoml / Apple-M1-BERT
3X speedup over Apple’s TensorFlow plugin by using Apache TVM on M1
☆138Updated 3 years ago
facebookresearch / FAMBench
Benchmarks to capture important workloads.
☆31Updated 9 months ago
jax-ml / ml_dtypes
A stand-alone implementation of several NumPy dtype extensions used in machine learning.
☆305Updated this week
octoml / octoml-profile
Home for OctoML PyTorch Profiler
☆114Updated 2 years ago
zdevito / custom_loader
☆13Updated 4 years ago
marsupialtail / sparsednn
Fast sparse deep learning on CPUs
☆56Updated 3 years ago
parasj / checkmate
Training neural networks in TensorFlow 2.0 with 5x less memory
☆136Updated 3 years ago
meta-pytorch / torchsnapshot
A performant, memory-efficient checkpointing library for PyTorch applications, designed with large, complex distributed workloads in mind…
☆161Updated last month
IntelLabs / SLIDE_opt_ia
☆74Updated last year
facebookresearch / dietgpu
GPU implementation of a fast generalized ANS (asymmetric numeral system) entropy encoder and decoder, with extensions for lossless compre…
☆354Updated this week
pytorch / torchdistx
Torch Distributed Experimental
☆117Updated last year
pytorch / tensorpipe
A tensor-aware point-to-point communication primitive for machine learning
☆274Updated 2 months ago
nv-legate / legate
The Foundation for All Legate Libraries
☆229Updated this week
pytorch / rfcs
PyTorch RFCs (experimental)
☆135Updated 5 months ago
google / jaxonnxruntime
A user-friendly tool chain that enables the seamless execution of ONNX models using JAX as the backend.
☆123Updated last month
fabiocannizzo / FastBinarySearch
Fast and vectorizable algorithms for searching in a vector of sorted floating point numbers
☆152Updated 10 months ago
spcl / daceml
A Data-Centric Compiler for Machine Learning
☆85Updated last year
UmerHA / triton_util
Make triton easier
☆48Updated last year
graphcore / tutorials
Training material for IPU users: tutorials, feature examples, simple applications
☆87Updated 2 years ago
CentML / DeepView.Profile
🏙 Interactive performance profiling and debugging tool for PyTorch neural networks.
☆64Updated 9 months ago
amd / ZenDNN
☆126Updated this week
onnx / steering-committee
Notes and artifacts from the ONNX steering committee
☆26Updated last week
sdpython / onnxcustom
Tutorial on how to convert machine learned models into ONNX
☆16Updated 2 years ago