IntelLabs / SLIDE_opt_ia
☆74 · Updated last year
Alternatives and similar repositories for SLIDE_opt_ia
Users who are interested in SLIDE_opt_ia are comparing it to the libraries listed below.
- Nod.ai 🦈 version of 👻. You probably want to start at https://github.com/nod-ai/shark for the product and the upstream IREE repository … ☆106 · Updated 7 months ago
- benchmarking some transformer deployments ☆26 · Updated 2 years ago
- PyTorch interface for the IPU ☆180 · Updated last year
- Pytorch code for the Nero optimiser. ☆20 · Updated 2 years ago
- A thin, highly portable toolkit for efficiently compiling dense loop-based computation. ☆148 · Updated 2 years ago
- ☆39 · Updated 2 years ago
- SLIDE (Sub-LInear Deep learning Engine) written in Go ☆45 · Updated 5 years ago
- Swarm training framework using Haiku + JAX + Ray for layer parallel transformer language models on unreliable, heterogeneous nodes ☆241 · Updated 2 years ago
- Customized matrix multiplication kernels ☆56 · Updated 3 years ago
- Python Research Framework ☆106 · Updated 2 years ago
- A GPT, made only of MLPs, in Jax ☆58 · Updated 4 years ago
- A collection of optimizers, some arcane others well known, for Flax. ☆29 · Updated 4 years ago
- A tracing JIT compiler for PyTorch ☆13 · Updated 3 years ago
- Differentiable Algorithms and Algorithmic Supervision. ☆116 · Updated 2 years ago
- PyTorch implementation of L2L execution algorithm ☆107 · Updated 2 years ago
- Memory Efficient Attention (O(sqrt(n)) for Jax and PyTorch ☆184 · Updated 2 years ago
- ☆68 · Updated last year
- Butterfly matrix multiplication in PyTorch ☆174 · Updated last year
- DiffQ performs differentiable quantization using pseudo quantization noise. It can automatically tune the number of bits used per weight … ☆236 · Updated 2 years ago
- GPU implementation of a fast generalized ANS (asymmetric numeral system) entropy encoder and decoder, with extensions for lossless compre… ☆346 · Updated last month
- a lightweight transformer library for PyTorch ☆72 · Updated 3 years ago
- Lightweight machine learning library based on OpenCL 1.2 ☆75 · Updated 4 years ago
- MONeT framework for reducing memory consumption of DNN training ☆173 · Updated 4 years ago
- An Aspiring Drop-In Replacement for Pandas at Scale ☆74 · Updated 3 years ago
- Implementation of a Tensorflow XLA rematerialization pass ☆15 · Updated 5 years ago
- Official code for "Distributed Deep Learning in Open Collaborations" (NeurIPS 2021) ☆117 · Updated 3 years ago
- Small deep learning library written from scratch in Python, using NumPy/CuPy. ☆125 · Updated 2 years ago
- Massively Parallel and Asynchronous Architecture for Logic-based AI ☆42 · Updated 2 years ago
- torch::deploy (multipy for non-torch uses) is a system that lets you get around the GIL problem by running multiple Python interpreters i… ☆180 · Updated last month
- Toy implementations of some popular ML optimizers using Python/JAX ☆44 · Updated 4 years ago