facebookresearch / loop_nest
Loop Nest - Linear algebra compiler and code generator.
☆21Updated 3 years ago
Alternatives and similar repositories for loop_nest
Users who are interested in loop_nest are comparing it to the libraries listed below.
- FlexAttention w/ FlashAttention3 Support☆27Updated last year
- ☆19Updated 3 years ago
- ☆16Updated last year
- Experimental scripts for researching data adaptive learning rate scheduling.☆22Updated 2 years ago
- Udacity CS344 Introduction to Parallel Programming (https://classroom.udacity.com/courses/cs344), with assignments/materials updated to …☆46Updated 4 years ago
- ☆54Updated last year
- Code and data for paper "(How) do Language Models Track State?"☆20Updated 8 months ago
- A LinearOperator implementation for PyTorch☆18Updated 4 years ago
- Customized matrix multiplication kernels☆57Updated 3 years ago
- Source-to-Source Debuggable Derivatives in Pure Python☆15Updated last year
- SParse AcceleRation on Tensor Architecture☆17Updated 8 months ago
- code associated with paper "Sparse Bayesian Optimization"☆26Updated 2 years ago
- Better bindings for Python☆19Updated 2 years ago
- A tracing JIT compiler for PyTorch☆13Updated 3 years ago
- Abstractions of memory, allocator, vector, tuple, shared_ptr, unique_ptr, bitset, variant and string working on both CPU and GPU☆31Updated 3 months ago
- PyTorch interface for the IPU☆181Updated 2 years ago
- Experiment of using Tangent to autodiff triton☆80Updated last year
- ⛰️ RockyML - A High-Performance Scientific Computing Framework for Non-smooth Machine Learning Problems☆20Updated 2 years ago
- Einsum optimization using opt_einsum and PyTorch FX graph rewriting☆22Updated 3 years ago
- No-GIL Python environment featuring NVIDIA Deep Learning libraries.☆69Updated 7 months ago
- APPy (Annotated Parallelism for Python) enables users to annotate loops and tensor expressions in Python with compiler directives akin to…☆27Updated this week
- Event-Triggered Communication in Parallel Machine Learning☆29Updated 4 years ago
- cuASR: CUDA Algebra for Semirings☆42Updated 3 years ago
- CUDA implementation of autoregressive linear attention, with all the latest research findings☆46Updated 2 years ago
- Python package for Geometric / Clifford Algebra with Pytorch.☆14Updated last month
- Some CUDA design patterns and a bit of template magic for CUDA☆157Updated 2 years ago
- Personal solutions to the Triton Puzzles☆20Updated last year
- Make triton easier☆49Updated last year
- benchmarking some transformer deployments☆26Updated 2 weeks ago
- Exploration into the Firefly algorithm in Pytorch☆41Updated 9 months ago