nod-ai / SRT
Nod.ai 🦈 version of 👻. You probably want to start at https://github.com/nod-ai/shark for the product and the upstream IREE repository for mainline development. This repository houses branches and configuration that aren't ready for commit upstream.
☆106, updated last week
Alternatives and similar repositories for SRT:
Users who are interested in SRT are comparing it to the libraries listed below.
- benchmarking some transformer deployments ☆26, updated last year
- ☆50, updated 5 months ago
- Home for OctoML PyTorch Profiler ☆107, updated last year
- Training neural networks in TensorFlow 2.0 with 5x less memory ☆130, updated 2 years ago
- Repository for the QUIK project, enabling the use of 4bit kernels for generative inference - EMNLP 2024 ☆175, updated 9 months ago
- 3X speedup over Apple's TensorFlow plugin by using Apache TVM on M1 ☆135, updated 2 years ago
- extensible collectives library in triton ☆76, updated 3 months ago
- The Foundation for All Legate Libraries ☆202, updated 3 weeks ago
- ☆48, updated 10 months ago
- PyTorch RFCs (experimental) ☆131, updated 4 months ago
- ☆12, updated 3 years ago
- A Fusion Code Generator for NVIDIA GPUs (commonly known as "nvFuser") ☆291, updated this week
- A thin, highly portable toolkit for efficiently compiling dense loop-based computation. ☆148, updated 2 years ago
- A library to analyze PyTorch traces. ☆324, updated last month
- A performant, memory-efficient checkpointing library for PyTorch applications, designed with large, complex distributed workloads in mind… ☆152, updated last month
- Unified compiler/runtime for interfacing with PyTorch Dynamo. ☆99, updated this week
- ☆99, updated 2 months ago
- OpenAI Triton backend for Intel® GPUs ☆154, updated this week
- A tracing JIT compiler for PyTorch ☆12, updated 3 years ago
- Backward compatible ML compute opset inspired by HLO/MHLO ☆428, updated this week
- torch::deploy (multipy for non-torch uses) is a system that lets you get around the GIL problem by running multiple Python interpreters i… ☆179, updated last month
- A Data-Centric Compiler for Machine Learning ☆82, updated last year
- Customized matrix multiplication kernels ☆53, updated 2 years ago
- A library of GPU kernels for sparse matrix operations. ☆251, updated 4 years ago
- ☆64, updated 2 months ago
- MLIR-based partitioning system ☆56, updated this week
- Stores documents and resources used by the OpenXLA developer community ☆113, updated 5 months ago
- Fast sparse deep learning on CPUs ☆51, updated 2 years ago
- A stand-alone implementation of several NumPy dtype extensions used in machine learning. ☆238, updated this week
- Development repository for the Triton language and compiler ☆102, updated this week