facebookresearch / loop_nest

Loop Nest - Linear algebra compiler and code generator.

☆22

Alternatives and similar repositories for loop_nest

Users who are interested in loop_nest are comparing it to the libraries listed below.

GindaChen / FlexFlashAttention3
FlexAttention w/ FlashAttention3 Support
☆26 Updated 9 months ago
facebookresearch / SparseBO
code associated with paper "Sparse Bayesian Optimization"
☆26 Updated last year
belindal / state-tracking
Code and data for paper "(How) do Language Models Track State?"
☆14 Updated 3 months ago
lucidrains / firefly-torch
Exploration into the Firefly algorithm in Pytorch
☆40 Updated 5 months ago
facebookresearch / adaptive_scheduling
Experimental scripts for researching data adaptive learning rate scheduling.
☆23 Updated last year
StoneY1 / Reproducing-BowNet
☆9 Updated 4 years ago
NVIDIA / free-threaded-python
No-GIL Python environment featuring NVIDIA Deep Learning libraries.
☆62 Updated 3 months ago
iree-org / iree-jax
☆52 Updated 11 months ago
lucidrains / autoregressive-linear-attention-cuda
CUDA implementation of autoregressive linear attention, with all the latest research findings
☆44 Updated 2 years ago
Z-T-WANG / LaProp-Optimizer
Codes accompanying the paper "LaProp: a Better Way to Combine Momentum with Adaptive Gradient"
☆29 Updated 4 years ago
bwasti / better_bindings
Better bindings for Python
☆17 Updated 2 years ago
alexzhang13 / Triton-Puzzles-Solutions
Personal solutions to the Triton Puzzles
☆19 Updated 11 months ago
ML-KULeuven / klay
Sparse Circuits on the GPU (ICLR2025)
☆12 Updated last month
google / jaxonnxruntime
A user-friendly tool chain that enables the seamless execution of ONNX models using JAX as the backend.
☆115 Updated 3 weeks ago
habanero-lab / APPy
APPy (Annotated Parallelism for Python) enables users to annotate loops and tensor expressions in Python with compiler directives akin to…
☆24 Updated 3 weeks ago
srush / drop7
☆18 Updated last year
hazan-lab / flash-stu
PyTorch implementation of the Flash Spectral Transform Unit.
☆17 Updated 9 months ago
cornellius-gp / linear_operator.old
A LinearOperator implementation for PyTorch
☆18 Updated 4 years ago
lucidrains / blackbox-gradient-sensing
Implementation and explorations into Blackbox Gradient Sensing (BGS), an evolutionary strategies approach proposed in a Google Deepmind p…
☆17 Updated last month
codekansas / rwkv
RWKV model implementation
☆38 Updated 2 years ago
ahennequ / cuda-tensorcores-register-mapping
☆18 Updated 2 years ago
kyutai-labs / jax-flash-attn3
JAX bindings for the flash-attention3 kernels
☆11 Updated 11 months ago
lucidrains / kalman-filtering-attention
Implementation of the Kalman Filtering Attention proposed in "Kalman Filtering Attention for User Behavior Modeling in CTR Prediction"
☆58 Updated last year
FrancescoSaverioZuppichini / pytorch-2.0-benchmark
Benchmarking PyTorch 2.0 different models
☆21 Updated 2 years ago
edwardjhu / improved_wasserstein
Code for our ICLR Trustworthy ML 2020 workshop paper "Improved Image Wasserstein Attacks and Defenses"
☆14 Updated 5 years ago
f-dangel / vivit
[TMLR 2022] Curvature access through the generalized Gauss-Newton's low-rank structure: Eigenvalues, eigenvectors, directional derivative…
☆17 Updated last year
facebookresearch / coocmap
code for paper "Accessing higher dimensions for unsupervised word translation"
☆21 Updated 2 years ago
UmerHA / triton_util
Make triton easier
☆47 Updated last year
srush / tangent
Source-to-Source Debuggable Derivatives in Pure Python
☆15 Updated last year
DeMoriarty / custom_matmul_kernels
Customized matrix multiplication kernels
☆56 Updated 3 years ago