☆22 · Nov 7, 2018 · Updated 7 years ago
Alternatives and similar repositories for vDNN
Users that are interested in vDNN are comparing it to the libraries listed below
- Implementation of vDNN++, an improvement over vDNN ☆18 · Dec 7, 2018 · Updated 7 years ago
- This is the release repository of superneurons ☆54 · Feb 13, 2021 · Updated 5 years ago
- ☆13 · Feb 22, 2023 · Updated 3 years ago
- Thinking is hard - automate it ☆18 · Aug 24, 2022 · Updated 3 years ago
- FTPipe and related pipeline model parallelism research. ☆44 · May 16, 2023 · Updated 2 years ago
- Repository to go along with the paper "Plumber: Diagnosing and Removing Performance Bottlenecks in Machine Learning Data Pipelines" ☆10 · Mar 31, 2022 · Updated 3 years ago
- DeepSZ: A Novel Framework to Compress Deep Neural Networks by Using Error-Bounded Lossy Compression ☆11 · Oct 7, 2020 · Updated 5 years ago
- Simple PyTorch profiler that combines DeepSpeed Flops Profiler and TorchInfo ☆11 · Feb 12, 2023 · Updated 3 years ago
- ☆12 · May 3, 2020 · Updated 5 years ago
- A host-based framework that transparently extends the GPU addressable global memory space beyond the host memory using NVM-backed data po… ☆63 · Sep 11, 2020 · Updated 5 years ago
- [ACM EuroSys 2023] Fast and Efficient Model Serving Using Multi-GPUs with Direct-Host-Access ☆56 · Aug 6, 2025 · Updated 7 months ago
- ML Input Data Processing as a Service. This repository contains the source code for Cachew (built on top of TensorFlow). ☆40 · Sep 10, 2024 · Updated last year
- PetPS: Supporting Huge Embedding Models with Tiered Memory ☆33 · May 21, 2024 · Updated last year
- ☆20 · Nov 12, 2025 · Updated 3 months ago
- ☆17 · Dec 9, 2022 · Updated 3 years ago
- ☆19 · Jul 26, 2021 · Updated 4 years ago
- Artifact of the ASPLOS'23 paper "GRACE: A Scalable Graph-Based Approach to Accelerating Recommendation Model Inference" ☆19 · Mar 5, 2023 · Updated 3 years ago
- Training neural networks in TensorFlow 2.0 with 5x less memory ☆137 · Feb 21, 2022 · Updated 4 years ago
- ☆21 · Nov 29, 2022 · Updated 3 years ago
- ☆18 · Mar 15, 2020 · Updated 5 years ago
- PipeSwitch: Fast Pipelined Context Switching for Deep Learning Applications ☆127 · May 9, 2022 · Updated 3 years ago
- Hi-DMM: High-Performance Dynamic Memory Management in HLS (High-Level Synthesis) ☆25 · Oct 30, 2018 · Updated 7 years ago
- ☆26 · Dec 5, 2022 · Updated 3 years ago
- An experimental communicating attention kernel based on DeepEP. ☆35 · Jul 29, 2025 · Updated 7 months ago
- ☆23 · Jun 21, 2023 · Updated 2 years ago
- ☆23 · Jun 5, 2019 · Updated 6 years ago
- ☆29 · Oct 27, 2023 · Updated 2 years ago
- DISB is a new DNN inference serving benchmark with diverse workloads and models, as well as real-world traces. ☆58 · Aug 21, 2024 · Updated last year
- A hybrid cache sharing-partitioning tool for systems with Intel CAT support. ☆31 · Mar 28, 2018 · Updated 7 years ago
- ☆36 · Jan 21, 2021 · Updated 5 years ago
- Studying Xilinx FPGAs using the Zybo Z7-20 board ☆14 · Mar 13, 2024 · Updated last year
- Carbon Explorer helps evaluate solutions to make datacenters operate on renewable energy. ☆88 · Nov 8, 2024 · Updated last year
- Fine-grained GPU sharing primitives ☆148 · Jul 28, 2025 · Updated 7 months ago
- Spark, Cassandra, Tessellation and ArcGIS ☆10 · Jan 18, 2015 · Updated 11 years ago
- ☆11 · Aug 23, 2023 · Updated 2 years ago
- Notes and examples for getting started with parallel computing with CUDA. ☆13 · Nov 1, 2019 · Updated 6 years ago
- A simple script to plot the Roofline model for given HW platforms and applications ☆10 · Aug 22, 2024 · Updated last year
- A curated list of awesome Gemini CLI extensions. ☆35 · Feb 4, 2026 · Updated last month
- ☆40 · Nov 28, 2022 · Updated 3 years ago