nod-ai / SRT
Nod.ai 🦈 version of 👻. You probably want to start at https://github.com/nod-ai/shark for the product and the upstream IREE repository for mainline development. This repository houses branches and configuration that aren't ready for commit upstream.
☆106 · Updated last month
Alternatives and similar repositories for SRT:
Users that are interested in SRT are comparing it to the libraries listed below
- benchmarking some transformer deployments ☆26 · Updated last year
- Home for OctoML PyTorch Profiler ☆107 · Updated last year
- ☆12 · Updated 3 years ago
- ☆51 · Updated 6 months ago
- Unified compiler/runtime for interfacing with PyTorch Dynamo. ☆100 · Updated this week
- Customized matrix multiplication kernels ☆53 · Updated 2 years ago
- A thin, highly portable toolkit for efficiently compiling dense loop-based computation. ☆148 · Updated 2 years ago
- torch::deploy (multipy for non-torch uses) is a system that lets you get around the GIL problem by running multiple Python interpreters i… ☆178 · Updated 2 months ago
- A Data-Centric Compiler for Machine Learning ☆82 · Updated last year
- A performant, memory-efficient checkpointing library for PyTorch applications, designed with large, complex distributed workloads in mind… ☆154 · Updated 2 months ago
- Training neural networks in TensorFlow 2.0 with 5x less memory ☆130 · Updated 2 years ago
- MLIR-based partitioning system ☆62 · Updated this week
- ☆48 · Updated 11 months ago
- Benchmarks to capture important workloads. ☆29 · Updated 3 weeks ago
- Implementation of a Transformer, but completely in Triton ☆257 · Updated 2 years ago
- A library of GPU kernels for sparse matrix operations. ☆255 · Updated 4 years ago
- 3X speedup over Apple's TensorFlow plugin by using Apache TVM on M1 ☆135 · Updated 2 years ago
- Stores documents and resources used by the OpenXLA developer community ☆116 · Updated 6 months ago
- MatMul Performance Benchmarks for a Single CPU Core comparing both hand engineered and codegen kernels. ☆127 · Updated last year
- npcomp - An aspirational MLIR based numpy compiler ☆51 · Updated 4 years ago
- ☆284 · Updated last week
- Backward compatible ML compute opset inspired by HLO/MHLO ☆446 · Updated last week
- An open-source efficient deep learning framework/compiler, written in python. ☆681 · Updated last week
- An experimental CPU backend for Triton (https://github.com/openai/triton) ☆38 · Updated 9 months ago
- Development repository for the Triton language and compiler ☆107 · Updated this week
- High-Performance SGEMM on CUDA devices ☆74 · Updated 3 weeks ago
- A user-friendly tool chain that enables the seamless execution of ONNX models using JAX as the backend. ☆107 · Updated 3 weeks ago
- A Fusion Code Generator for NVIDIA GPUs (commonly known as "nvFuser") ☆303 · Updated this week
- A tracing JIT compiler for PyTorch ☆12 · Updated 3 years ago
- An Aspiring Drop-In Replacement for Pandas at Scale ☆75 · Updated 3 years ago