Accelerating DNN Convolutional Layers with Micro-batches
☆63Apr 30, 2020Updated 6 years ago
Alternatives and similar repositories for ucudnn
Users who are interested in ucudnn are comparing it to the libraries listed below.
- Squeeze-unet Semantic Segmentation for embedded devices☆29Apr 13, 2018Updated 8 years ago
- Source for Demystifying GPU Microarchitecture through Microbenchmarking☆18May 29, 2023Updated 3 years ago
- Haystack is an analytical cache model that given a program computes the number of cache misses.☆46Jul 15, 2019Updated 6 years ago
- Dual-way gradient sparsification approach for async DNN training, based on PyTorch.☆10Dec 8, 2022Updated 3 years ago
- Python tools for NVIDIA Profiler☆21Dec 21, 2017Updated 8 years ago
- ☆15Jul 7, 2020Updated 5 years ago
- ☆12Sep 29, 2017Updated 8 years ago
- Script to check ONNX model compatibility against TensorRT versions using docker images☆12Nov 23, 2023Updated 2 years ago
- A CUDNN minimal deep learning training code sample using LeNet.☆269Jul 30, 2023Updated 2 years ago
- C++ framework for deep learning☆13Dec 1, 2022Updated 3 years ago
- maskrcnn implementation using chainer☆14Jun 12, 2018Updated 8 years ago
- Data and devtools for the "Large-Scale Object Discovery and Detector Adaptation from Unlabeled Video" paper.☆21Nov 2, 2018Updated 7 years ago
- GPU Optimization and Memory Abstraction Framework☆33Oct 31, 2019Updated 6 years ago
- Torch FFI-bindings for NNPACK☆31May 26, 2017Updated 9 years ago
- Self-learning hands-on for Chainer by Jupyter notebook☆43Feb 14, 2017Updated 9 years ago
- News in Privacy-Preserving Machine Learning☆12Feb 5, 2020Updated 6 years ago
- This is the open-source version of TinyTS. The code is dirty so far. We may clean the code in the future.☆21Aug 11, 2025Updated 10 months ago
- 📝 "Synthesizing Benchmarks for Predictive Modeling" (🥇 CGO'17 Best Paper)☆22Feb 10, 2023Updated 3 years ago
- This repository contains the PyTorch scripts to train mixed-precision networks for microcontroller deployment, based on the memory contr…☆51May 9, 2024Updated 2 years ago
- Question Dependent Recurrent Entity Network☆13Sep 21, 2017Updated 8 years ago
- TP-PARSEC: A Task Parallel PARSEC Benchmark Suite☆11Nov 1, 2020Updated 5 years ago
- Minimum viable code for the Decodable Information Bottleneck paper. Pytorch Implementation.☆12Oct 20, 2020Updated 5 years ago
- The LLVM Project is a collection of modular and reusable compiler and toolchain technologies. Note: the repository does not accept github…☆33Feb 21, 2026Updated 4 months ago
- IsaacGymGrasp runs a robot grasping physics simulator that can visualize, execute, and evaluate numerous robot grasps in simultaneous env…☆18Mar 14, 2023Updated 3 years ago
- ComScribe is a tool to identify communication among all GPU-GPU and CPU-GPU pairs in a single-node multi-GPU system.☆27Jul 6, 2023Updated 2 years ago
- ☆62Mar 15, 2018Updated 8 years ago
- Synthetic Line Art Dataset☆21Jul 10, 2021Updated 4 years ago
- Repository for the code of the paper "Neural Networks Regularization Through Class-wise Invariant Representation Learning".☆12Oct 1, 2017Updated 8 years ago
- Dockerized cross-compilation for the Bela platform☆14May 24, 2020Updated 6 years ago
- A catalogue of efficient and accurate polynomial approximations☆17Feb 5, 2022Updated 4 years ago
- ☆24Mar 22, 2018Updated 8 years ago
- Code for reproducing the results from "CrAM: A Compression-Aware Minimizer" accepted at ICLR 2023☆10Mar 1, 2023Updated 3 years ago
- ext_mpi_collectives☆11Jun 3, 2026Updated 3 weeks ago
- ☆20Apr 27, 2016Updated 10 years ago
- NVIDIA GPU direct RDMA using SISCI API☆18Apr 8, 2018Updated 8 years ago
- Collective communications library with various primitives for multi-machine training.☆1,437Jun 17, 2026Updated last week
- Python software to construct a simple skin around a wire frame mesh in Blender☆31Nov 5, 2019Updated 6 years ago
- Chainer x TensorRT☆34Mar 20, 2019Updated 7 years ago
- How we hooked &yet's SimpleWebRTC into XirSys's STUN and TURN servers☆19Jul 16, 2016Updated 9 years ago