tensorflow / custom-op
Guide for building a custom op for TensorFlow
☆378 Updated last year
Alternatives and similar repositories for custom-op:
Users who are interested in custom-op compare it to the libraries listed below.
- A performant and modular runtime for TensorFlow ☆759 Updated last month
- TVM integration into PyTorch ☆452 Updated 5 years ago
- common in-memory tensor structure ☆959 Updated last week
- A tensor-aware point-to-point communication primitive for machine learning ☆255 Updated 2 years ago
- ☆580 Updated 6 years ago
- TensorFlow source code reading notes ☆190 Updated 6 years ago
- Dive into Deep Learning Compiler ☆647 Updated 2 years ago
- ☆408 Updated last week
- The Tensor Algebra SuperOptimizer for Deep Learning ☆704 Updated 2 years ago
- A profiling and performance analysis tool for TensorFlow ☆369 Updated this week
- [Deprecated] The TensorFlow Profiler (TFProf) UI provides a visual interface for profiling TensorFlow models. ☆136 Updated 5 years ago
- Python bindings for NVTX ☆66 Updated last year
- A CUDNN minimal deep learning training code sample using LeNet. ☆264 Updated last year
- TensorFlow-nGraph bridge ☆136 Updated 4 years ago
- TensorFlow Estimator ☆301 Updated last year
- HugeCTR is a high efficiency GPU framework designed for Click-Through-Rate (CTR) estimating training ☆979 Updated this week
- TensorFlow examples in C, C++, Go and Python without bazel but with cmake and FindTensorFlow.cmake ☆439 Updated 5 years ago
- Fast and Adaptive Distributed Machine Learning for TensorFlow, PyTorch and MindSpore. ☆293 Updated last year
- Place for meetup slides ☆140 Updated 4 years ago
- High performance Cross-platform Inference-engine, you could run Anakin on x86-cpu, arm, nv-gpu, amd-gpu, bitmain and cambricon devices. ☆533 Updated 2 years ago
- Enhanced networking support for TensorFlow. Maintained by SIG-networking. ☆98 Updated 3 years ago
- A flexible and efficient deep neural network (DNN) compiler that generates high-performance executable from a DNN model description. ☆979 Updated 6 months ago
- ☆372 Updated 7 years ago
- Collective communications library with various primitives for multi-machine training. ☆1,277 Updated this week
- Symbolic Expression and Statement Module for new DSLs ☆205 Updated 4 years ago
- Documentation for StreamExecutor open source proposal ☆83 Updated 8 years ago
- Running BERT without Padding ☆472 Updated 3 years ago
- High performance distributed framework for training deep learning recommendation models based on PyTorch. ☆402 Updated this week
- Large Model Support in Tensorflow ☆202 Updated 4 years ago
- Computation using data flow graphs for scalable machine learning ☆67 Updated this week