RUSH-LAB / SLIDE
☆471 Updated 3 years ago
Alternatives and similar repositories for SLIDE:
Users who are interested in SLIDE are comparing it to the libraries listed below:
- Codebase for "SLIDE : In Defense of Smart Algorithms over Hardware Acceleration for Large-Scale Deep Learning Systems" ☆1,094 Updated 3 years ago
- Accelerate your Neural Architecture Search (NAS) through fast, reproducible and modular research. ☆474 Updated 5 months ago
- Fast Block Sparse Matrices for Pytorch ☆546 Updated 4 years ago
- common in-memory tensor structure ☆968 Updated this week
- A uniform interface to run deep learning models from multiple frameworks ☆935 Updated last year
- PyTorch elastic training ☆730 Updated 2 years ago
- GPU implementation of a fast generalized ANS (asymmetric numeral system) entropy encoder and decoder, with extensions for lossless compre… ☆322 Updated last week
- Simple Training and Deployment of Fast End-to-End Binary Networks ☆157 Updated 3 years ago
- A GPU performance profiling tool for PyTorch models ☆505 Updated 3 years ago
- Mesh TensorFlow: Model Parallelism Made Easier ☆1,603 Updated last year
- DiffQ performs differentiable quantization using pseudo quantization noise. It can automatically tune the number of bits used per weight … ☆235 Updated last year
- Swarm training framework using Haiku + JAX + Ray for layer parallel transformer language models on unreliable, heterogeneous nodes ☆237 Updated last year
- ☆771 Updated last year
- A thin, highly portable toolkit for efficiently compiling dense loop-based computation. ☆148 Updated 2 years ago
- Tensors and Dynamic neural networks in Python with strong GPU acceleration ☆221 Updated this week
- FB (Facebook) + GEMM (General Matrix-Matrix Multiplication) - https://code.fb.com/ml-applications/fbgemm/ ☆1,280 Updated this week
- A flexible and efficient deep neural network (DNN) compiler that generates high-performance executable from a DNN model description. ☆979 Updated 6 months ago
- Haste: a fast, simple, and open RNN library ☆330 Updated last year
- A Python-level JIT compiler designed to make unmodified PyTorch programs faster. ☆1,037 Updated 11 months ago
- PyTorch, TensorFlow, JAX and NumPy — all of them natively using the same code ☆695 Updated 2 years ago
- Efficient GPU kernels for block-sparse matrix multiplication and convolution ☆1,039 Updated last year
- An open-source efficient deep learning framework/compiler, written in python. ☆692 Updated last month
- The Tensor Algebra SuperOptimizer for Deep Learning ☆704 Updated 2 years ago
- Nod.ai 🦈 version of 👻 . You probably want to start at https://github.com/nod-ai/shark for the product and the upstream IREE repository … ☆106 Updated 2 months ago
- The official page of ROCm/PyTorch will contain information that is always confusing. On this page we will endeavor to describe accurate i… ☆87 Updated 4 years ago
- 10x faster matrix and vector operations ☆2,481 Updated 2 years ago
- Code for Parameter Prediction for Unseen Deep Architectures (NeurIPS 2021) ☆486 Updated last year
- Library for 8-bit optimizers and quantization routines. ☆717 Updated 2 years ago
- 3X speedup over Apple’s TensorFlow plugin by using Apache TVM on M1 ☆136 Updated 2 years ago
- A platform for managing machine learning experiments ☆841 Updated last week