openai/blocksparse

openai / blocksparse

Efficient GPU kernels for block-sparse matrix multiplication and convolution

☆1,067

Alternatives and similar repositories for blocksparse

Users who are interested in blocksparse are comparing it to the libraries listed below.

openai / sparse_attention
Examples of using sparse attention, as in "Generating Long Sequences with Sparse Transformers"
☆1,615 · Aug 12, 2020 · Updated 5 years ago
ptillet / torch-blocksparse
Block-sparse primitives for PyTorch
☆158 · Apr 5, 2021 · Updated 5 years ago
huggingface / pytorch_block_sparse
Fast Block Sparse Matrices for Pytorch
☆551 · Jan 21, 2021 · Updated 5 years ago
google-research / sputnik
A library of GPU kernels for sparse matrix operations.
☆289 · Nov 24, 2020 · Updated 5 years ago
YulhwaKim / cutlass_tilesparse
CUDA templates for tile-sparse matrix multiplication based on CUTLASS.
☆52 · Mar 1, 2018 · Updated 8 years ago
NVIDIA / cutlass
CUDA Templates and Python DSLs for High-Performance Linear Algebra
☆10,104 · Updated this week
facebookresearch / TensorComprehensions
A domain specific language to express machine learning workloads.
☆1,767 · Apr 28, 2023 · Updated 3 years ago
openai / openai-gemm
Open single and half precision gemm implementations
☆396 · Apr 2, 2023 · Updated 3 years ago
openai / distribution_augmentation
Code for the paper, "Distribution Augmentation for Generative Modeling", ICML 2020.
☆131 · Apr 24, 2023 · Updated 3 years ago
NVIDIA / apex
A PyTorch Extension: Tools for easy mixed precision and distributed training in Pytorch
☆8,982 · Updated this week
facebookresearch / adaptive-span
Transformer training code for sequential tasks
☆610 · Sep 14, 2021 · Updated 4 years ago
pytorch / FBGEMM
FB (Facebook) + GEMM (General Matrix-Matrix Multiplication) - https://code.fb.com/ml-applications/fbgemm/
☆1,570 · Updated this week
google / gemmlowp
Low-precision matrix multiplication
☆1,843 · Jan 29, 2024 · Updated 2 years ago
tensor-compiler / taco
The Tensor Algebra Compiler (taco) computes sparse tensor expressions on CPUs and GPUs
☆1,364 · Apr 14, 2025 · Updated last year
asappresearch / sru
Training RNNs as Fast as CNNs (https://arxiv.org/abs/1709.02755)
☆2,107 · Jan 4, 2022 · Updated 4 years ago
salesforce / pytorch-qrnn
PyTorch implementation of the Quasi-Recurrent Neural Network - up to 16 times faster than NVIDIA's cuDNN LSTM
☆1,263 · Feb 12, 2022 · Updated 4 years ago
facebookresearch / fairscale
PyTorch extensions for high performance and large scale training.
☆3,410 · Apr 26, 2025 · Updated last year
NervanaSystems / maxas
Assembler for NVIDIA Maxwell architecture
☆1,072 · Jan 3, 2023 · Updated 3 years ago
triton-lang / triton
Development repository for the Triton language and compiler
☆19,725 · Updated this week
NVIDIA / DALI
A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep lear…
☆5,727 · Updated this week
pytorch / glow
Compiler for Neural Network hardware accelerators
☆3,321 · May 11, 2024 · Updated 2 years ago
NVIDIA / nccl
Optimized primitives for collective multi-GPU communication
☆4,892 · Updated this week
horovod / horovod
Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet.
☆14,695 · Jun 20, 2026 · Updated 3 weeks ago
apache / tvm
Open Machine Learning Compiler Framework
☆13,588 · Updated this week
kakaobrain / torchgpipe
A GPipe implementation in PyTorch
☆865 · Jul 25, 2024 · Updated last year
openai / finetune-transformer-lm
Code and model for the paper "Improving Language Understanding by Generative Pre-Training"
☆2,306 · Jan 25, 2019 · Updated 7 years ago
openai / ceph-chef
Chef cookbooks for managing a Ceph cluster
☆11 · Apr 2, 2023 · Updated 3 years ago
tensorflow / lingvo
Lingvo
☆2,860 · Jun 22, 2026 · Updated 3 weeks ago
facebookresearch / deepfloat
An exploration of log domain "alternative floating point" for hardware ML/AI accelerators.
☆400 · Mar 11, 2023 · Updated 3 years ago
NVIDIA / cub
[ARCHIVED] Cooperative primitives for CUDA C++. See https://github.com/NVIDIA/cccl
☆1,840 · Oct 9, 2023 · Updated 2 years ago
cybertronai / gradient-checkpointing
Make huge neural nets fit in memory
☆2,843 · Apr 26, 2020 · Updated 6 years ago
szagoruyko / diracnets
Training Very Deep Neural Networks Without Skip-Connections
☆590 · Jun 9, 2018 · Updated 8 years ago
jax-ml / jax
Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more
☆36,015 · Updated this week
kastnerkyle / representation_mixing
Demos, pretrained models, and (WIP) code supporting Representation Mixing
☆51 · Dec 18, 2018 · Updated 7 years ago
openai / glow
Code for reproducing results in "Glow: Generative Flow with Invertible 1x1 Convolutions"
☆3,183 · Jul 23, 2024 · Updated last year
NVIDIA / Megatron-LM
Ongoing research training transformer models at scale
☆17,108 · Updated this week
shrubb / box-convolutions
PyTorch code for the "Deep Neural Networks with Box Convolutions" paper
☆508 · Jan 20, 2020 · Updated 6 years ago
NVIDIA / FasterTransformer
Transformer related optimization, including BERT, GPT
☆6,439 · Mar 27, 2024 · Updated 2 years ago
facebookresearch / XLM
PyTorch original implementation of Cross-lingual Language Model Pretraining.
☆2,927 · Feb 14, 2023 · Updated 3 years ago