huggingface/pytorch_block_sparse

huggingface / pytorch_block_sparse

Fast Block Sparse Matrices for Pytorch

☆551

Alternatives and similar repositories for pytorch_block_sparse

Users that are interested in pytorch_block_sparse are comparing it to the libraries listed below.

ptillet / torch-blocksparse
Block-sparse primitives for PyTorch
☆158 · Apr 5, 2021 · Updated 5 years ago
openai / blocksparse
Efficient GPU kernels for block-sparse matrix multiplication and convolution
☆1,067 · Jun 8, 2023 · Updated 3 years ago
YulhwaKim / cutlass_tilesparse
CUDA templates for tile-sparse matrix multiplication based on CUTLASS.
☆52 · Mar 1, 2018 · Updated 8 years ago
laiguokun / Funnel-Transformer
☆220 · Jun 8, 2020 · Updated 6 years ago
idiap / fast-transformers
Pytorch library for fast transformer implementations
☆1,772 · Mar 23, 2023 · Updated 3 years ago
facebookresearch / adaptive-span
Transformer training code for sequential tasks
☆610 · Sep 14, 2021 · Updated 4 years ago
facebookresearch / fairscale
PyTorch extensions for high performance and large scale training.
☆3,410 · Apr 26, 2025 · Updated last year
harvardnlp / pytorch-struct
Fast, general, and tested differentiable structured prediction in PyTorch
☆1,132 · Apr 20, 2022 · Updated 4 years ago
facebookresearch / SentAugment
SentAugment is a data augmentation technique for NLP that retrieves similar sentences from a large bank of sentences. It can be used in c…
☆359 · Feb 22, 2022 · Updated 4 years ago
microsoft / fastformers
FastFormers - highly efficient transformer models for NLU
☆706 · Mar 21, 2025 · Updated last year
allenai / longformer
Longformer: The Long-Document Transformer
☆2,201 · Feb 8, 2023 · Updated 3 years ago
google-research / sputnik
A library of GPU kernels for sparse matrix operations.
☆289 · Nov 24, 2020 · Updated 5 years ago
facebookresearch / higher
higher is a pytorch library allowing users to obtain higher order gradients over losses spanning training loops rather than individual tr…
☆1,629 · Mar 25, 2022 · Updated 4 years ago
google-research / long-range-arena
Long Range Arena for Benchmarking Efficient Transformers
☆787 · Dec 16, 2023 · Updated 2 years ago
AranKomat / Metroplex
☆21 · Mar 15, 2023 · Updated 3 years ago
rusty1s / pytorch_sparse
PyTorch Extension Library of Optimized Autograd Sparse Matrix Operations
☆1,103 · Jun 3, 2026 · Updated last month
openai / sparse_attention
Examples of using sparse attention, as in "Generating Long Sequences with Sparse Transformers"
☆1,615 · Aug 12, 2020 · Updated 5 years ago
lucidrains / reformer-pytorch
Reformer, the efficient Transformer, in Pytorch
☆2,191 · Jun 21, 2023 · Updated 3 years ago
microsoft / fastseq
An efficient implementation of the popular sequence models for text generation, summarization, and translation tasks. https://arxiv.org/p…
☆433 · Aug 17, 2022 · Updated 3 years ago
antofuller / configaformers
A python library for highly configurable transformers - easing model architecture search and experimentation.
☆48 · Nov 30, 2021 · Updated 4 years ago
NVIDIA / apex
A PyTorch Extension: Tools for easy mixed precision and distributed training in Pytorch
☆8,982 · Updated this week
Smerity / sha-rnn
Single Headed Attention RNN - "Stop thinking with your head"
☆1,180 · Nov 27, 2021 · Updated 4 years ago
arogozhnikov / einops
Flexible and powerful tensor operations for readable and reliable code (for pytorch, jax, TF and others)
☆9,553 · Jul 5, 2026 · Updated 2 weeks ago
google-research / rigl
End-to-end training of sparse deep neural networks with little-to-no performance loss.
☆337 · Jan 26, 2023 · Updated 3 years ago
open-nudge / torchlayers
Shape and dimension inference (Keras-like) for PyTorch layers and neural networks
☆570 · Jun 13, 2022 · Updated 4 years ago
patrick-kidger / torchtyping
Type annotations and dynamic checking for a tensor's shape, dtype, names, etc.
☆1,484 · May 2, 2025 · Updated last year
huggingface / awesome-papers
Papers & presentation materials from Hugging Face's internal science day
☆2,051 · Oct 31, 2020 · Updated 5 years ago
alexa / bort
Repository for the paper "Optimal Subarchitecture Extraction for BERT"
☆470 · Jun 22, 2022 · Updated 4 years ago
lucidrains / routing-transformer
Fully featured implementation of Routing Transformer
☆300 · Nov 6, 2021 · Updated 4 years ago
sacmehta / delight
DeLighT: Very Deep and Light-Weight Transformers
☆469 · Oct 16, 2020 · Updated 5 years ago
google / objax
☆774 · Jan 27, 2024 · Updated 2 years ago
LiyuanLucasLiu / RAdam
On the Variance of the Adaptive Learning Rate and Beyond
☆2,547 · Jul 31, 2021 · Updated 4 years ago
huggingface / nn_pruning
Prune a model while finetuning or training.
☆409 · Jun 21, 2022 · Updated 4 years ago
lucidrains / token-shift-gpt
Implementation of Token Shift GPT - An autoregressive model that solely relies on shifting the sequence space for mixing
☆49 · Jan 27, 2022 · Updated 4 years ago
huggingface / knockknock
🚪✊ Knock Knock: Get notified when your training ends with only two additional lines of code
☆2,825 · Jun 23, 2023 · Updated 3 years ago
mit-han-lab / lite-transformer
[ICLR 2020] Lite Transformer with Long-Short Range Attention
☆609 · Jul 11, 2024 · Updated 2 years ago
microsoft / infinibatch
Efficient, check-pointed data loading for deep learning with massive data sets.
☆211 · Jun 12, 2023 · Updated 3 years ago
juntang-zhuang / Adabelief-Optimizer
Repository for NeurIPS 2020 Spotlight "AdaBelief Optimizer: Adapting stepsizes by the belief in observed gradients"
☆1,071 · Aug 9, 2024 · Updated last year
getkeops / keops
KErnel OPerationS, on CPUs and GPUs, with autodiff and without memory overflows
☆1,187 · Updated this week