Documentation for StreamExecutor open source proposal
☆83Mar 28, 2016Updated 9 years ago
Alternatives and similar repositories for streamexecutordoc
Users that are interested in streamexecutordoc are comparing it to the libraries listed below
- ☆371Oct 23, 2017Updated 8 years ago
- TensorFlow source code reading notes☆191Sep 18, 2018Updated 7 years ago
- Documentation for the entire CGRAFlow☆19Sep 17, 2021Updated 4 years ago
- This repo is used to assess NSL's scientific research assistants.☆18Jul 7, 2025Updated 8 months ago
- Minimal numerical computation library with TensorFlow APIs☆304Jan 2, 2019Updated 7 years ago
- An IR for efficiently simulating distributed ML computation.☆32Jan 13, 2024Updated 2 years ago
- ☆23Apr 25, 2023Updated 2 years ago
- GPU-specialized parameter server for GPU machine learning.☆102Apr 5, 2018Updated 7 years ago
- TF2 implementation of DLRM (inherited and modified from openrec's initial implementation)☆15Jul 14, 2020Updated 5 years ago
- ☆423Feb 24, 2026Updated 3 weeks ago
- Haskell binding for Menoh DNN inference library☆12Nov 30, 2018Updated 7 years ago
- Towards Hardware and Software Continuous Integration☆13Jun 8, 2020Updated 5 years ago
- A performant and modular runtime for TensorFlow☆753Sep 4, 2025Updated 6 months ago
- Enhanced networking support for TensorFlow. Maintained by SIG-networking.☆98Nov 19, 2021Updated 4 years ago
- A fast multi-producer, multi-consumer lock-free concurrent queue for C++11☆10May 25, 2015Updated 10 years ago
- BladeDISC is an end-to-end DynamIc Shape Compiler project for machine learning workloads.☆921Dec 30, 2024Updated last year
- Reed-Solomon Erasure Coding in Haskell☆23Jan 22, 2017Updated 9 years ago
- OFI Programmer's Guide☆52Dec 29, 2022Updated 3 years ago
- FFT for PyCuda and PyOpenCL. The package is deprecated and its functionality is merged into Reikna.☆37Feb 17, 2014Updated 12 years ago
- The Tensor Algebra SuperOptimizer for Deep Learning☆740Jan 26, 2023Updated 3 years ago
- Proof-of-Concept CNN in Halide☆22Aug 4, 2016Updated 9 years ago
- Sublinear memory optimization for deep learning, reduce GPU memory cost to train deeper nets☆28Apr 22, 2016Updated 9 years ago
- TensorFlow and TVM integration☆36Apr 27, 2020Updated 5 years ago
- A benchmark framework for TensorFlow☆1,145Oct 6, 2023Updated 2 years ago
- Collective communications library with various primitives for multi-machine training.☆1,405Mar 11, 2026Updated last week
- TVM stack: exploring the incredible explosion of deep-learning frameworks and how to bring them together☆64May 22, 2018Updated 7 years ago
- CUDA Waste is a wrapper for emulation of CUDA programs on Windows☆15Feb 17, 2016Updated 10 years ago
- Voice from TUNA☆18Dec 10, 2018Updated 7 years ago
- A flexible and efficient deep neural network (DNN) compiler that generates high-performance executable from a DNN model description.☆1,003Sep 19, 2024Updated last year
- An Efficient Pipelined Data Parallel Approach for Training Large Model☆75Dec 11, 2020Updated 5 years ago
- ☆601Apr 6, 2018Updated 7 years ago
- MPI bindings for Haskell☆46Apr 1, 2023Updated 2 years ago
- heterogeneity-aware-lowering-and-optimization☆257Jan 20, 2024Updated 2 years ago
- ☆12Feb 5, 2023Updated 3 years ago
- A self-contained computer stack hobby project☆15Dec 23, 2016Updated 9 years ago
- Chaos: Scale-out Graph Processing from Secondary Storage☆51Mar 14, 2016Updated 10 years ago
- A simple memory manager for CUDA designed to help Deep Learning frameworks manage memory☆300Nov 28, 2018Updated 7 years ago
- An open source ebook about the TensorFlow kernel and its implementation mechanisms.☆2,895May 5, 2023Updated 2 years ago
- An MLIR frontend for tensor expressions☆24Sep 5, 2020Updated 5 years ago