nod-ai / SRT
Nod.ai 🦈 version of 👻. You probably want to start at https://github.com/nod-ai/shark for the product and the upstream IREE repository for mainline development. This repository houses branches and configuration that aren't ready for commit upstream.
★107 Updated last month
Alternatives and similar repositories for SRT
Users that are interested in SRT are comparing it to the libraries listed below
- benchmarking some transformer deployments ★26 Updated last month
- A tracing JIT compiler for PyTorch ★13 Updated 4 years ago
- PyTorch interface for the IPU ★181 Updated 2 years ago
- 3X speedup over Apple's TensorFlow plugin by using Apache TVM on M1 ★137 Updated 3 years ago
- Customized matrix multiplication kernels ★57 Updated 3 years ago
- A thin, highly portable toolkit for efficiently compiling dense loop-based computation. ★149 Updated 3 years ago
- torch::deploy (multipy for non-torch uses) is a system that lets you get around the GIL problem by running multiple Python interpreters i… ★182 Updated last month
- ★55 Updated last year
- The Foundation for All Legate Libraries ★233 Updated this week
- Home for OctoML PyTorch Profiler ★114 Updated 2 years ago
- Automatically insert nvtx ranges to PyTorch models ★22 Updated 4 years ago
- PyTorch RFCs (experimental) ★137 Updated 8 months ago
- Demo of the unit_scaling library, showing how a model can be easily adapted to train in FP8. ★46 Updated last year
- A Data-Centric Compiler for Machine Learning ★85 Updated last month
- Training neural networks in TensorFlow 2.0 with 5x less memory ★137 Updated 3 years ago
- GPU implementation of a fast generalized ANS (asymmetric numeral system) entropy encoder and decoder, with extensions for lossless compre… ★370 Updated 2 weeks ago
- A tensor-aware point-to-point communication primitive for machine learning ★283 Updated last month
- ★50 Updated last year
- Benchmarks to capture important workloads. ★32 Updated 2 weeks ago
- A library for syntactically rewriting Python programs, pronounced (sinner). ★67 Updated 3 years ago
- Torch Distributed Experimental ★117 Updated last year
- A performant, memory-efficient checkpointing library for PyTorch applications, designed with large, complex distributed workloads in mind… ★164 Updated 2 weeks ago
- An experimental CPU backend for Triton (https://github.com/openai/triton) ★48 Updated 5 months ago
- Interactive performance profiling and debugging tool for PyTorch neural networks. ★64 Updated last year
- ★13 Updated 4 years ago
- Implementation of a Tensorflow XLA rematerialization pass ★15 Updated 6 years ago
- Notes and artifacts from the ONNX steering committee ★28 Updated this week
- A stand-alone implementation of several NumPy dtype extensions used in machine learning. ★327 Updated 3 weeks ago
- Fairring (FAIR + Herring) is a plug-in for PyTorch that provides a process group for distributed training that outperforms NCCL at large … ★65 Updated 3 years ago
- A user-friendly tool chain that enables the seamless execution of ONNX models using JAX as the backend. ★130 Updated last month