skylineprof / skyline
Interactive in-editor performance profiling, visualization, and debugging for PyTorch neural networks.
☆32 · Updated 2 years ago
Alternatives and similar repositories for skyline
Users who are interested in skyline are comparing it to the libraries listed below.
- MONeT framework for reducing memory consumption of DNN training ☆174 · Updated 4 years ago
- PyTorch implementation of L2L execution algorithm ☆108 · Updated 2 years ago
- Training neural networks in TensorFlow 2.0 with 5x less memory ☆135 · Updated 3 years ago
- Research and development for optimizing transformers ☆130 · Updated 4 years ago
- Dynamic Tensor Rematerialization prototype (modified PyTorch) and simulator. Paper: https://arxiv.org/abs/2006.09616 ☆132 · Updated 2 years ago
- Lightweight and Parallel Deep Learning Framework ☆264 · Updated 2 years ago
- Programmable Neural Network Compression ☆150 · Updated 3 years ago
- Block-sparse primitives for PyTorch ☆160 · Updated 4 years ago
- A tensor-aware point-to-point communication primitive for machine learning ☆274 · Updated last month
- ☆42 · Updated 9 months ago
- ParaDnn: A systematic performance analysis methodology for deep learning. ☆39 · Updated 5 years ago
- [Prototype] Tools for the concurrent manipulation of variably sized Tensors. ☆251 · Updated 2 years ago
- ☆253 · Updated last year
- A GPU performance profiling tool for PyTorch models ☆505 · Updated 4 years ago
- ☆57 · Updated 3 years ago
- PyTorch interface for the IPU ☆181 · Updated last year
- This repository contains the results and code for the MLPerf™ Training v0.7 benchmark. ☆57 · Updated 2 years ago
- sensAI: ConvNets Decomposition via Class Parallelism for Fast Inference on Live Data ☆64 · Updated last year
- A performant, memory-efficient checkpointing library for PyTorch applications, designed with large, complex distributed workloads in mind… ☆161 · Updated 2 weeks ago
- A runtime fault injection tool for PyTorch ☆119 · Updated last year
- Fairring (FAIR + Herring) is a plug-in for PyTorch that provides a process group for distributed training that outperforms NCCL at large … ☆65 · Updated 3 years ago
- Simple Distributed Deep Learning on TensorFlow ☆134 · Updated 3 months ago
- Train ImageNet in 18 minutes on AWS ☆133 · Updated last year
- GPU implementation of a fast generalized ANS (asymmetric numeral system) entropy encoder and decoder, with extensions for lossless compre… ☆353 · Updated 3 months ago
- [MLSys 2021] IOS: Inter-Operator Scheduler for CNN Acceleration ☆200 · Updated 3 years ago
- ☆108 · Updated 4 years ago
- PyProf2: PyTorch Profiling tool ☆82 · Updated 5 years ago
- Large Model Support in PyTorch ☆134 · Updated 3 years ago
- Customized matrix multiplication kernels ☆56 · Updated 3 years ago
- Reference implementations of popular Binarized Neural Networks ☆108 · Updated this week