facebookresearch / loop_nest
Loop Nest - Linear algebra compiler and code generator.
☆22 Updated 2 years ago
Alternatives and similar repositories for loop_nest
Users that are interested in loop_nest are comparing it to the libraries listed below
- FlexAttention w/ FlashAttention3 Support ☆27 Updated 11 months ago
- No-GIL Python environment featuring NVIDIA Deep Learning libraries. ☆64 Updated 5 months ago
- Experimental scripts for researching data adaptive learning rate scheduling. ☆22 Updated last year
- Udacity CS344 Introduction to Parallel Programming (https://classroom.udacity.com/courses/cs344), with assignments/materials updated to … ☆46 Updated 4 years ago
- Code and data for paper "(How) do Language Models Track State?" ☆18 Updated 6 months ago
- ☆16 Updated last year
- Customized matrix multiplication kernels ☆56 Updated 3 years ago
- ☆19 Updated 2 years ago
- code associated with paper "Sparse Bayesian Optimization" ☆26 Updated last year
- A LinearOperator implementation for PyTorch ☆18 Updated 4 years ago
- ☆52 Updated last year
- SParse AcceleRation on Tensor Architecture ☆17 Updated 5 months ago
- A user-friendly tool chain that enables the seamless execution of ONNX models using JAX as the backend. ☆123 Updated 2 weeks ago
- Some CUDA design patterns and a bit of template magic for CUDA ☆156 Updated 2 years ago
- A tracing JIT compiler for PyTorch ☆13 Updated 3 years ago
- Make triton easier ☆47 Updated last year
- Benchmarking PyTorch 2.0 different models ☆20 Updated 2 years ago
- Direct solver for sparse SPD matrices for nonlinear optimization. Implements supernodal Cholesky decomposition algorithm, and supports GP… ☆92 Updated last week
- Einsum optimization using opt_einsum and PyTorch FX graph rewriting ☆21 Updated 3 years ago
- code for paper "Accessing higher dimensions for unsupervised word translation" ☆21 Updated 2 years ago
- This library empowers users to seamlessly port pretrained models and checkpoints on the HuggingFace (HF) hub (developed using HF transfor… ☆78 Updated last week
- cuASR: CUDA Algebra for Semirings ☆39 Updated 3 years ago
- Efficient Householder Transformation in PyTorch ☆66 Updated 4 years ago
- ☆61 Updated this week
- Exploration into the Firefly algorithm in Pytorch ☆41 Updated 7 months ago
- Quantize transformers to any learned arbitrary 4-bit numeric format ☆48 Updated 2 months ago
- MagmaDNN: a simple deep learning framework in c++ ☆50 Updated 5 years ago
- Better bindings for Python ☆18 Updated 2 years ago
- Material for the course Large-Scale Convex Optimisation at LTH, autumn 2020 ☆14 Updated 4 years ago
- A Visual Studio Code extension for building and debugging CUDA applications. ☆90 Updated last week