Documentation for StreamExecutor open source proposal
☆83Mar 28, 2016Updated 9 years ago
Alternatives and similar repositories for streamexecutordoc
Users that are interested in streamexecutordoc are comparing it to the libraries listed below
- ☆371Oct 23, 2017Updated 8 years ago
- TensorFlow source code reading notes☆191Sep 18, 2018Updated 7 years ago
- Simple example of implementing a new Tensorflow operation and its gradient in C++.☆56Mar 28, 2019Updated 6 years ago
- This repo is used to assess NSL's scientific research assistants.☆18Jul 7, 2025Updated 8 months ago
- Minimal numerical computation library with TensorFlow APIs☆304Jan 2, 2019Updated 7 years ago
- An IR for efficiently simulating distributed ML computation.☆32Jan 13, 2024Updated 2 years ago
- Compiler toolkit for neuFlow.☆26Jul 7, 2013Updated 12 years ago
- ☆23Apr 25, 2023Updated 2 years ago
- GPU-specialized parameter server for GPU machine learning.☆102Apr 5, 2018Updated 7 years ago
- TF2 implementation of DLRM (inherited and modified from openrec's initial implementation)☆15Jul 14, 2020Updated 5 years ago
- ☆423Feb 24, 2026Updated 3 weeks ago
- Haskell binding for Menoh DNN inference library☆12Nov 30, 2018Updated 7 years ago
- A performant and modular runtime for TensorFlow☆753Sep 4, 2025Updated 6 months ago
- DLPack for Tensorflow☆35Apr 13, 2020Updated 5 years ago
- Enhanced networking support for TensorFlow. Maintained by SIG-networking.☆98Nov 19, 2021Updated 4 years ago
- Installation scripts for CUDA, cuDNN, TensorFlow, Caffe, etc. on Ubuntu machines☆24Aug 1, 2021Updated 4 years ago
- BladeDISC is an end-to-end DynamIc Shape Compiler project for machine learning workloads.☆921Dec 30, 2024Updated last year
- Reed-Solomon Erasure Coding in Haskell☆23Jan 22, 2017Updated 9 years ago
- OFI Programmer's Guide☆52Dec 29, 2022Updated 3 years ago
- FFT for PyCuda and PyOpenCL. The package is deprecated and its functionality is merged into Reikna.☆37Feb 17, 2014Updated 12 years ago
- The Tensor Algebra SuperOptimizer for Deep Learning☆740Jan 26, 2023Updated 3 years ago
- Sublinear memory optimization for deep learning, reduce GPU memory cost to train deeper nets☆28Apr 22, 2016Updated 9 years ago
- TensorFlow and TVM integration☆36Apr 27, 2020Updated 5 years ago
- A benchmark framework for Tensorflow☆1,145Oct 6, 2023Updated 2 years ago
- Heterogeneous Active Messages C++ library☆21Nov 8, 2019Updated 6 years ago
- Collective communications library with various primitives for multi-machine training.☆1,405Mar 11, 2026Updated last week
- TVM stack: exploring the incredible explosion of deep-learning frameworks and how to bring them together☆64May 22, 2018Updated 7 years ago
- CUDA Waste is a wrapper for emulation of CUDA programs on Windows☆15Feb 17, 2016Updated 10 years ago
- Voice from TUNA☆18Dec 10, 2018Updated 7 years ago
- A flexible and efficient deep neural network (DNN) compiler that generates high-performance executable from a DNN model description.☆1,003Sep 19, 2024Updated last year
- An Efficient Pipelined Data Parallel Approach for Training Large Model☆75Dec 11, 2020Updated 5 years ago
- ☆601Apr 6, 2018Updated 7 years ago
- TensorFlow kernels for probing memory☆15Mar 2, 2017Updated 9 years ago
- MPI bindings for Haskell☆46Apr 1, 2023Updated 2 years ago
- heterogeneity-aware-lowering-and-optimization☆257Jan 20, 2024Updated 2 years ago
- A self-contained computer stack hobby project☆15Dec 23, 2016Updated 9 years ago
- Chaos: Scale-out Graph Processing from Secondary Storage☆51Mar 14, 2016Updated 10 years ago
- A simple memory manager for CUDA designed to help Deep Learning frameworks manage memory☆300Nov 28, 2018Updated 7 years ago
- An open source ebook about the TensorFlow kernel and its implementation mechanisms.☆2,897May 5, 2023Updated 2 years ago