David-Levinthal / machine-learning
repository for notes and data from machine learning studies
☆11 Updated 5 years ago
Alternatives and similar repositories for machine-learning:
Users who are interested in machine-learning are comparing it to the libraries listed below.
- A Deep Learning Meta-Framework and HPC Benchmarking Library ☆81 Updated 2 years ago
- Kernel Fusion and Runtime Compilation Based on NNVM ☆70 Updated 8 years ago
- A simple tool to profile performance of multiple combinations of GEMM of cuBLAS ☆25 Updated 4 years ago
- Intel(R) Machine Learning Scaling Library is a library providing an efficient implementation of communication patterns used in deep learn… ☆109 Updated 2 years ago
- Flexible GPGPU instrumentation ☆86 Updated 5 years ago
- Convert nvprof profiles into about:tracing compatible JSON files ☆69 Updated 4 years ago
- The SHOC Benchmark Suite ☆251 Updated 3 years ago
- ☆50 Updated 5 years ago
- Stretching GPU performance for GEMMs and tensor contractions. ☆235 Updated this week
- A simple memory manager for CUDA designed to help Deep Learning frameworks manage memory ☆297 Updated 6 years ago
- Assembler for NVIDIA Volta and Turing GPUs ☆215 Updated 3 years ago
- ☆61 Updated 3 months ago
- Third party assembler and GEMM library for NVIDIA Kepler GPU ☆81 Updated 5 years ago
- ☆20 Updated 2 years ago
- this is the release repository of superneurons ☆52 Updated 4 years ago
- STREAM, for lots of devices written in many programming models ☆332 Updated 7 months ago
- Provides the examples to write and build Habana custom kernels using the HabanaTools ☆21 Updated 4 months ago
- A GPU benchmark suite for assessing on-chip GPU memory bandwidth ☆104 Updated 7 years ago
- Multi-GPU Computing Benchmark Suite (CUDA) ☆42 Updated 7 years ago
- MatMul Performance Benchmarks for a Single CPU Core comparing both hand engineered and codegen kernels. ☆129 Updated last year
- ☆91 Updated 8 years ago
- Experimental projects related to TensorRT ☆95 Updated this week
- Python bindings for NVTX ☆66 Updated last year
- Machine Learning Toolkit for Extreme Scale (MaTEx) ☆111 Updated 6 years ago
- Chai ☆43 Updated last year
- Reference workloads for modern deep learning methods. ☆73 Updated 2 years ago
- rocSHMEM intra-kernel networking runtime for AMD dGPUs on the ROCm platform. ☆67 Updated last week
- ☆21 Updated last month
- CUDA Tensor Transpose (cuTT) library ☆51 Updated 7 years ago
- Original Python version of Intel® Nervana™ Graph ☆215 Updated 2 years ago