IntelLabs / SLIDE_opt_ia
☆74 Updated last year
Related projects
Alternatives and complementary repositories for SLIDE_opt_ia
- Nod.ai 🦈 version of 👻. You probably want to start at https://github.com/nod-ai/shark for the product and the upstream IREE repository … ☆107 Updated this week
- Benchmarking some transformer deployments ☆26 Updated last year
- A GPT, made only of MLPs, in Jax ☆55 Updated 3 years ago
- Customized matrix multiplication kernels ☆53 Updated 2 years ago
- Pytorch code for the Nero optimiser. ☆20 Updated 2 years ago
- ☆38 Updated last year
- A collection of optimizers, some arcane others well known, for Flax. ☆29 Updated 3 years ago
- SLIDE (Sub-LInear Deep learning Engine) written in Go ☆42 Updated 4 years ago
- Swarm training framework using Haiku + JAX + Ray for layer parallel transformer language models on unreliable, heterogeneous nodes ☆237 Updated last year
- ☆471 Updated 3 years ago
- A Learnable LSH Framework for Efficient NN Training ☆30 Updated 3 years ago
- Deep learning for the Web ☆36 Updated 3 years ago
- A deep learning library based on Pytorch focussed on low resource language research and robustness ☆69 Updated 2 years ago
- ☆67 Updated last year
- PyTorch implementation of L2L execution algorithm ☆106 Updated last year
- Official code for "Distributed Deep Learning in Open Collaborations" (NeurIPS 2021) ☆116 Updated 2 years ago
- ☆12 Updated 3 years ago
- Differentiable Algorithms and Algorithmic Supervision. ☆105 Updated last year
- ☆18 Updated 2 years ago
- This is a Tensor Train based compression library to compress sparse embedding tables used in large-scale machine learning models such as … ☆193 Updated 2 years ago
- Memory Efficient Attention (O(sqrt(n))) for Jax and PyTorch ☆179 Updated last year
- Pytorch and Jax code for the Madam optimiser. ☆51 Updated 3 years ago
- Torch Distributed Experimental ☆116 Updated 3 months ago
- DiffQ performs differentiable quantization using pseudo quantization noise. It can automatically tune the number of bits used per weight … ☆234 Updated last year
- Learned Hyperparameter Optimizers ☆58 Updated 3 years ago
- GPU implementation of a fast generalized ANS (asymmetric numeral system) entropy encoder and decoder, with extensions for lossless compre… ☆317 Updated last week
- Example python package with pybind11 cpp extension ☆57 Updated 3 years ago
- A thin, highly portable toolkit for efficiently compiling dense loop-based computation. ☆148 Updated last year
- A lightweight wrapper for PyTorch that provides a simple declarative API for context switching between devices, distributed modes, mixed-… ☆66 Updated last year