facebookresearch / loop_nest
Loop Nest - Linear algebra compiler and code generator.
☆22 Updated 2 years ago
Alternatives and similar repositories for loop_nest:
Users who are interested in loop_nest are comparing it to the libraries listed below.
- Better bindings for Python ☆17 Updated 2 years ago
- ☆51 Updated 7 months ago
- A tracing JIT compiler for PyTorch ☆13 Updated 3 years ago
- FlexAttention w/ FlashAttention3 Support ☆26 Updated 5 months ago
- No-GIL Python environment featuring NVIDIA Deep Learning libraries. ☆45 Updated 2 weeks ago
- ☆58 Updated this week
- Experimental scripts for researching data adaptive learning rate scheduling. ☆23 Updated last year
- A user-friendly tool chain that enables the seamless execution of ONNX models using JAX as the backend. ☆109 Updated 3 weeks ago
- SParse AcceleRation on Tensor Architecture ☆17 Updated this week
- ☆18 Updated 2 years ago
- Computing the greatest common divisor with transformers, source code for the paper https://arxiv.org/abs/2308.15594 ☆14 Updated 11 months ago
- Personal solutions to the Triton Puzzles ☆18 Updated 8 months ago
- Numbast is a tool to build an automated pipeline that converts CUDA APIs into Numba bindings. ☆39 Updated this week
- MLPerf™ Mobile models ☆25 Updated 5 months ago
- ☆12 Updated 3 years ago
- LLM training in simple, raw C/CUDA ☆92 Updated 10 months ago
- ☆15 Updated 5 months ago
- Open source cross-platform compiler for compute-intensive loops used in AI algorithms, from Microsoft Research ☆110 Updated last year
- MLIR tools and dialect for GraphBLAS ☆18 Updated 2 years ago
- CuPy Benchmark ☆12 Updated 5 years ago
- Code for our ICLR Trustworthy ML 2020 workshop paper "Improved Image Wasserstein Attacks and Defenses" ☆14 Updated 4 years ago
- A thin, highly portable toolkit for efficiently compiling dense loop-based computation. ☆148 Updated 2 years ago
- Official Implementation of "CheckEmbed: Effective Verification of LLM Solutions to Open-Ended Tasks" ☆17 Updated 3 months ago
- cuASR: CUDA Algebra for Semirings ☆35 Updated 2 years ago
- Benchmarking PyTorch 2.0 different models ☆21 Updated 2 years ago
- Memory Optimizations for Deep Learning (ICML 2023) ☆62 Updated last year
- ☆9 Updated 4 years ago
- ☆27 Updated 2 months ago
- ☆10 Updated 8 months ago
- FP64 equivalent GEMM via Int8 Tensor Cores using the Ozaki scheme ☆56 Updated this week