jrk / gradient-halide
☆102 Updated 4 years ago
Related projects:
- An example of a CUDA extension for PyTorch using CuPy which computes the Hadamard product of two tensors ☆117 Updated 6 months ago
- Simple example of implementing a new TensorFlow operation and its gradient in C++. ☆56 Updated 5 years ago
- CNNs in Halide ☆22 Updated 8 years ago
- ☆22 Updated 5 years ago
- Python Binding to NVRTC ☆79 Updated 6 years ago
- CuPy fused PyTorch neural networks ops ☆274 Updated 6 years ago
- Proof-of-Concept CNN in Halide ☆21 Updated 8 years ago
- Example code used in the CVPR 2015 tutorial ☆38 Updated 8 years ago
- Efficient forward propagation for BCNNs ☆50 Updated 7 years ago
- Boda: A C++ Framework for Efficient Experiments in Computer Vision ☆63 Updated 5 years ago
- ☆63 Updated 6 years ago
- This is a demo project that shows how you can utilize Caffe2's modular design and build a library on top of it. ☆40 Updated 5 years ago
- Developing a customized op in TensorFlow for convolution with a sparse kernel ☆28 Updated 5 years ago
- Greentea LibDNN - a universal convolution implementation supporting CUDA and OpenCL ☆135 Updated 7 years ago
- Examples of C extensions for PyTorch ☆258 Updated last year
- Multi-core CPU implementation of deep learning for 2D and 3D sliding window convolutional networks (ConvNets). ☆94 Updated 7 years ago
- Accelerating DNN Convolutional Layers with Micro-batches ☆64 Updated 4 years ago
- Binarized Neural Network TF training code + C matrix / eval library. ☆98 Updated 6 years ago
- TensorFlow util for building memory usage timeline from LOG_MEMORY messages ☆65 Updated 6 years ago
- Caffe implementation of accurate low-precision neural networks ☆118 Updated 5 years ago
- Case Studies for Halide performance against C++ and OpenCL ☆37 Updated 10 years ago
- ONNX Parser is a tool that automatically generates OpenVX inference code (CNN) from ONNX binary model files. ☆17 Updated 5 years ago
- This code implements the NICE paper ☆21 Updated 5 years ago
- This example builds on the parallel-forall repo separate compilation example by adding CMake to it. ☆17 Updated 6 years ago
- PyProf2: PyTorch Profiling tool ☆83 Updated 4 years ago
- BinaryNets in TensorFlow with XNOR GEMM op ☆154 Updated 7 years ago
- One Network to Solve Them All ☆85 Updated 5 years ago
- An implementation of Deep Joint Demosaicking and Denoising - SIGGRAPH Asia 2016 ☆109 Updated last year
- ☆28 Updated 6 years ago
- Test Winograd convolution written in TVM for CUDA and AMDGPU ☆39 Updated 5 years ago