Accelerating DNN Convolutional Layers with Micro-batches
☆63Apr 30, 2020Updated 5 years ago
Alternatives and similar repositories for ucudnn
Users who are interested in ucudnn are comparing it to the libraries listed below
- A Deep Learning Meta-Framework and HPC Benchmarking Library☆81May 23, 2022Updated 3 years ago
- Squeeze-unet Semantic Segmentation for embedded devices☆29Apr 13, 2018Updated 7 years ago
- Distributed Communication-Optimal LU-factorization Algorithm☆12Aug 1, 2021Updated 4 years ago
- ONNX SEA-RAFT, optical flow☆14Jan 5, 2026Updated 2 months ago
- a model zoo☆11Jul 19, 2017Updated 8 years ago
- Question Dependent Recurrent Entity Network☆13Sep 21, 2017Updated 8 years ago
- Compiler toolchain to enable generation of high-level DSLs for geophysical fluid dynamics models☆29Mar 22, 2023Updated 2 years ago
- Absinthe is an optimization framework to fuse and tile stencil codes in one shot☆14Jul 17, 2019Updated 6 years ago
- Repository for the code of the paper "Neural Networks Regularization Through Class-wise Invariant Representation Learning".☆12Oct 1, 2017Updated 8 years ago
- Script to check ONNX model compatibility against TensorRT versions using docker images☆12Nov 23, 2023Updated 2 years ago
- ☆13Oct 10, 2018Updated 7 years ago
- Dual-way gradient sparsification approach for async DNN training, based on PyTorch.☆11Dec 8, 2022Updated 3 years ago
- TP-PARSEC: A Task Parallel PARSEC Benchmark Suite☆11Nov 1, 2020Updated 5 years ago
- A CUDNN minimal deep learning training code sample using LeNet.☆267Jul 30, 2023Updated 2 years ago
- ☆12Sep 29, 2017Updated 8 years ago
- implement mxnet face and insightface with mapr streams for near real time face detection and recognition in video streams with residual n…☆15Apr 15, 2018Updated 7 years ago
- Streaming Message Interface: High-Performance Distributed Memory Programming on Reconfigurable Hardware☆15Mar 1, 2022Updated 4 years ago
- Code accompanying the paper "PrimiTect: Fast Continuous Hough Voting for Primitive Detection" by C. Sommer, Y. Sun, E. Bylow and D. Creme…☆33Mar 25, 2021Updated 4 years ago
- Depth_conv for MobileNet☆30Jun 22, 2020Updated 5 years ago
- Automatic differentiation with uarray/unumpy.☆16Mar 7, 2021Updated 5 years ago
- Some deep learning models written with mxnet and C++11.☆12Feb 6, 2018Updated 8 years ago
- ☆13Apr 10, 2017Updated 8 years ago
- The network of the faceboxes☆15Sep 26, 2017Updated 8 years ago
- IsaacGymGrasp runs a robot grasping physics simulator that can visualize, execute, and evaluate numerous robot grasps in simultaneous env…☆18Mar 14, 2023Updated 2 years ago
- The loss function from the article "Focal Loss for Dense Object Detection"☆17Sep 20, 2017Updated 8 years ago
- Cost-Effective Object Detection: Active Sample Mining with Switchable Selection Criteria☆12Dec 1, 2018Updated 7 years ago
- Mask R-CNN implementation using Chainer☆14Jun 12, 2018Updated 7 years ago
- A Gluon implementation of EfficientNet☆12Jul 3, 2019Updated 6 years ago
- Good Features to Correlate for Visual Tracking☆17Jul 17, 2018Updated 7 years ago
- Analogs of Linguistic Structure in Deep Representations☆19Jul 27, 2017Updated 8 years ago
- Train neural networks to automate your home☆19Mar 1, 2023Updated 3 years ago
- A regularly updated comparison of CNN architectures in terms of accuracy, operations and model size☆45Jan 23, 2019Updated 7 years ago
- CHIPKIT: An agile, reusable open-source framework for rapid test chip development☆42May 24, 2020Updated 5 years ago
- ☆62Mar 15, 2018Updated 7 years ago
- Downsampled Open Images Dataset V4 with 15.4 M bounding boxes for 600 categories on 1.9M images☆51Dec 19, 2018Updated 7 years ago
- An example of MXNet model compilation and deployment with NNVM in C++☆16Apr 25, 2018Updated 7 years ago
- Trust region policy optimization based on Gym and TensorFlow; can run in distributed mode☆15May 6, 2017Updated 8 years ago
- A MXNet implementation of Xception☆20Sep 26, 2017Updated 8 years ago
- ICML 2017 accepted papers on arXiv.org☆17May 25, 2017Updated 8 years ago