IntelLabs / SLIDE_opt_ia

☆74

Alternatives and similar repositories for SLIDE_opt_ia

Users who are interested in SLIDE_opt_ia are comparing it to the libraries listed below.

nod-ai / SRT
Nod.ai 🦈 version of 👻 . You probably want to start at https://github.com/nod-ai/shark for the product and the upstream IREE repository …
☆106 Updated 7 months ago
nod-ai / transformer-benchmarks
benchmarking some transformer deployments
☆26 Updated 2 years ago
graphcore / poptorch
PyTorch interface for the IPU
☆180 Updated last year
jxbz / nero
👑 Pytorch code for the Nero optimiser.
☆20 Updated 2 years ago
facebookresearch / loop_tool
A thin, highly portable toolkit for efficiently compiling dense loop-based computation.
☆148 Updated 2 years ago
shawwn / ml-notes
☆39 Updated 2 years ago
nlpodyssey / goslide
SLIDE (Sub-LInear Deep learning Engine) written in Go
☆45 Updated 5 years ago
kingoflolz / swarm-jax
Swarm training framework using Haiku + JAX + Ray for layer parallel transformer language models on unreliable, heterogeneous nodes
☆241 Updated 2 years ago
DeMoriarty / custom_matmul_kernels
Customized matrix multiplication kernels
☆56 Updated 3 years ago
EleutherAI / pyfra
Python Research Framework
☆106 Updated 2 years ago
lucidrains / mlp-gpt-jax
A GPT, made only of MLPs, in Jax
☆58 Updated 4 years ago
nestordemeure / flaxOptimizers
A collection of optimizers, some arcane others well known, for Flax.
☆29 Updated 4 years ago
nunoplopes / torchy
A tracing JIT compiler for PyTorch
☆13 Updated 3 years ago
Felix-Petersen / algovision
Differentiable Algorithms and Algorithmic Supervision.
☆116 Updated 2 years ago
TezRomacH / layer-to-layer-pytorch
PyTorch implementation of L2L execution algorithm
☆107 Updated 2 years ago
AminRezaei0x443 / memory-efficient-attention
Memory Efficient Attention (O(sqrt(n))) for Jax and PyTorch
☆184 Updated 2 years ago
srush / torch-queue
☆68 Updated last year
HazyResearch / butterfly
Butterfly matrix multiplication in PyTorch
☆174 Updated last year
facebookresearch / diffq
DiffQ performs differentiable quantization using pseudo quantization noise. It can automatically tune the number of bits used per weight …
☆236 Updated 2 years ago
facebookresearch / dietgpu
GPU implementation of a fast generalized ANS (asymmetric numeral system) entropy encoder and decoder, with extensions for lossless compre…
☆346 Updated last month
MathInf / toroidal
a lightweight transformer library for PyTorch
☆72 Updated 3 years ago
iperov / litenn
Lightweight machine learning library based on OpenCL 1.2
☆75 Updated 4 years ago
utsaslab / MONeT
MONeT framework for reducing memory consumption of DNN training
☆173 Updated 4 years ago
nv-legate / legate.pandas
An Aspiring Drop-In Replacement for Pandas at Scale
☆74 Updated 3 years ago
microsoft / tensorflow-rematerialization
Implementation of a Tensorflow XLA rematerialization pass
☆15 Updated 5 years ago
yandex-research / DeDLOC
Official code for "Distributed Deep Learning in Open Collaborations" (NeurIPS 2021)
☆117 Updated 3 years ago
sradc / SmallPebble
Small deep learning library written from scratch in Python, using NumPy/CuPy.
☆125 Updated 2 years ago
cair / PyTsetlinMachineCUDA
Massively Parallel and Asynchronous Architecture for Logic-based AI
☆42 Updated 2 years ago
pytorch / multipy
torch::deploy (multipy for non-torch uses) is a system that lets you get around the GIL problem by running multiple Python interpreters i…
☆180 Updated last month
shreyansh26 / ML-Optimizers-JAX
Toy implementations of some popular ML optimizers using Python/JAX
☆44 Updated 4 years ago