sjperkins / tfopgen

Generate C++ and CUDA boilerplate for TensorFlow custom operators

☆20

Alternatives and similar repositories for tfopgen

Users who are interested in tfopgen are comparing it to the repositories listed below.

yaroslavvb / memory_util
TensorFlow util for building memory usage timeline from LOG_MEMORY messages
☆65 Updated 7 years ago
MethodsOfMachineLearning / cabs
TensorFlow implementation of SGD with Coupled Adaptive Batch Size (CABS)
☆44 Updated 8 years ago
MycChiu / fast-LayerNorm-TF
Efficient layer normalization GPU kernel for TensorFlow
☆111 Updated 8 years ago
diux-dev / ncluster
☆37 Updated 6 years ago
szagoruyko / nnpack.torch
Torch FFI-bindings for NNPACK
☆30 Updated 8 years ago
MatthieuCourbariaux / deep-learning-multipliers
Training deep neural networks with low precision multiplications
☆63 Updated 10 years ago
loudinthecloud / dpwa
Distributed Learning by Pair-Wise Averaging
☆52 Updated 7 years ago
lnsmith54 / super-convergence
Files to create the figures in the paper "Super-Convergence: Very Fast Training of Residual Networks Using Large Learning Rates"
☆191 Updated 7 years ago
Alexey-Kamenev / Benchmarks
Benchmarks for CNTK and other toolkits.
☆44 Updated 9 years ago
FlorianMuellerklein / Identity-Mapping-ResNet-Lasagne
Reproduction of some of the results from 'Identity Mappings in Deep Residual Networks'
☆72 Updated 8 years ago
opveclib / opveclib
The Operator Vectorization Library, or OVL, is a python productivity library for defining high performance custom operators for the Tenso…
☆68 Updated 8 years ago
apaszke / pytorch-dist
☆35 Updated 8 years ago
NervanaSystems / caffe2neon
Tools to convert Caffe models to neon's serialization format
☆39 Updated 2 years ago
uoguelph-mlrg / Theano-MPI
MPI Parallel framework for training deep learning models built in Theano
☆54 Updated 7 years ago
sixin-zh / mpiT
MPI for Torch
☆61 Updated 8 years ago
jonsafari / nvidia-ml-py
Bugfixing fork of Python bindings for the NVIDIA GPU Management Library
☆51 Updated 8 years ago
eladhoffer / bigBatch
Code used to generate the results appearing in "Train longer, generalize better: closing the generalization gap in large batch training o…
☆149 Updated 8 years ago
jotaf98 / curveball
Second-order optimiser for deep networks
☆76 Updated 6 years ago
ezyang / onnx-pytorch
PyTorch development for onnx
☆21 Updated 7 years ago
ROCm / nnvm-rocm
NNVM for ROCm Examples
☆19 Updated 7 years ago
KarenUllrich / Tutorial-SoftWeightSharingForNNCompression
A tutorial on 'Soft weight-sharing for Neural Network compression' published at ICLR2017
☆145 Updated 8 years ago
bamos / setGPU
Small Python library to automatically set CUDA_VISIBLE_DEVICES to the least loaded device on multi-GPU systems.
☆107 Updated 2 years ago
lantiga / pytorch2c
A Python module for compiling PyTorch graphs to C
☆91 Updated 7 years ago
jhjin / flattened-cnn
Flattened convolutional neural networks (1D convolution modules for Torch nn)
☆61 Updated 9 years ago
szagoruyko / cunnproduction
Easily embeddable Torch7 networks
☆35 Updated 8 years ago
benanne / nervana_theano
A rudimentary wrapper around the fast Maxwell kernels for GEMM and convolution operations provided by nervanagpu
☆34 Updated 10 years ago
ebetica / autogradpp
Direct C++ Interface to PyTorch
☆81 Updated 7 years ago
pytorch / extension-ffi
Examples of C extensions for PyTorch
☆256 Updated 2 years ago
diogo149 / theano_fractional_max_pooling
Fractional Max Pooling implementation in Theano
☆21 Updated 9 years ago
negrinho / deep_architect_legacy
DeepArchitect: Automatically Designing and Training Deep Architectures
☆147 Updated 5 years ago