Fast Block Sparse Matrices for Pytorch
☆549 · Jan 21, 2021 · Updated 5 years ago
Alternatives and similar repositories for pytorch_block_sparse
Users interested in pytorch_block_sparse are comparing it to the libraries listed below.
- Block-sparse primitives for PyTorch ☆158 · Apr 5, 2021 · Updated 4 years ago
- Efficient GPU kernels for block-sparse matrix multiplication and convolution ☆1,063 · Jun 8, 2023 · Updated 2 years ago
- ☆221 · Jun 8, 2020 · Updated 5 years ago
- Pytorch library for fast transformer implementations ☆1,762 · Mar 23, 2023 · Updated 2 years ago
- Fast, general, and tested differentiable structured prediction in PyTorch ☆1,123 · Apr 20, 2022 · Updated 3 years ago
- Transformer training code for sequential tasks ☆609 · Sep 14, 2021 · Updated 4 years ago
- SentAugment is a data augmentation technique for NLP that retrieves similar sentences from a large bank of sentences. It can be used in c… ☆359 · Feb 22, 2022 · Updated 4 years ago
- FastFormers - highly efficient transformer models for NLU ☆709 · Mar 21, 2025 · Updated 11 months ago
- CUDA templates for tile-sparse matrix multiplication based on CUTLASS. ☆50 · Mar 1, 2018 · Updated 8 years ago
- higher is a pytorch library allowing users to obtain higher order gradients over losses spanning training loops rather than individual tr… ☆1,627 · Mar 25, 2022 · Updated 3 years ago
- Longformer: The Long-Document Transformer ☆2,188 · Feb 8, 2023 · Updated 3 years ago
- PyTorch extensions for high performance and large scale training. ☆3,400 · Apr 26, 2025 · Updated 10 months ago
- Long Range Arena for Benchmarking Efficient Transformers ☆781 · Dec 16, 2023 · Updated 2 years ago
- ☆21 · Mar 15, 2023 · Updated 2 years ago
- A python library for highly configurable transformers - easing model architecture search and experimentation. ☆48 · Nov 30, 2021 · Updated 4 years ago
- Single Headed Attention RNN - "Stop thinking with your head" ☆1,180 · Nov 27, 2021 · Updated 4 years ago
- Examples of using sparse attention, as in "Generating Long Sequences with Sparse Transformers" ☆1,610 · Aug 12, 2020 · Updated 5 years ago
- PyTorch Extension Library of Optimized Autograd Sparse Matrix Operations ☆1,092 · Aug 12, 2025 · Updated 6 months ago
- End-to-end training of sparse deep neural networks with little-to-no performance loss. ☆335 · Jan 26, 2023 · Updated 3 years ago
- Flexible and powerful tensor operations for readable and reliable code (for pytorch, jax, TF and others) ☆9,401 · Feb 20, 2026 · Updated last week
- Papers & presentation materials from Hugging Face's internal science day ☆2,052 · Oct 31, 2020 · Updated 5 years ago
- Shape and dimension inference (Keras-like) for PyTorch layers and neural networks ☆570 · Jun 13, 2022 · Updated 3 years ago
- An efficient implementation of the popular sequence models for text generation, summarization, and translation tasks. https://arxiv.org/p… ☆433 · Aug 17, 2022 · Updated 3 years ago
- A PyTorch Extension: Tools for easy mixed precision and distributed training in Pytorch ☆8,926 · Updated this week
- DeLighT: Very Deep and Light-Weight Transformers ☆469 · Oct 16, 2020 · Updated 5 years ago
- On the Variance of the Adaptive Learning Rate and Beyond ☆2,548 · Jul 31, 2021 · Updated 4 years ago
- Repository for the paper "Optimal Subarchitecture Extraction for BERT" ☆470 · Jun 22, 2022 · Updated 3 years ago
- Repository for NeurIPS 2020 Spotlight "AdaBelief Optimizer: Adapting stepsizes by the belief in observed gradients" ☆1,068 · Aug 9, 2024 · Updated last year
- Code for the Shortformer model, from the ACL 2021 paper by Ofir Press, Noah A. Smith and Mike Lewis. ☆147 · Jul 26, 2021 · Updated 4 years ago
- Type annotations and dynamic checking for a tensor's shape, dtype, names, etc. ☆1,472 · May 2, 2025 · Updated 9 months ago
- A library of GPU kernels for sparse matrix operations. ☆283 · Nov 24, 2020 · Updated 5 years ago
- [ICLR 2020] Lite Transformer with Long-Short Range Attention ☆611 · Jul 11, 2024 · Updated last year
- Parallelformers: An Efficient Model Parallelization Toolkit for Deployment ☆791 · Apr 24, 2023 · Updated 2 years ago
- PyTorch implementation of Sinusoidal Representation networks (SIREN) ☆267 · Dec 8, 2022 · Updated 3 years ago
- KErnel OPerationS, on CPUs and GPUs, with autodiff and without memory overflows ☆1,158 · Feb 6, 2026 · Updated 3 weeks ago
- ☆774 · Jan 27, 2024 · Updated 2 years ago
- PyTorch implementation of L2L execution algorithm ☆108 · Jan 16, 2023 · Updated 3 years ago
- Hopfield Networks is All You Need ☆1,901 · Apr 23, 2023 · Updated 2 years ago
- A GPipe implementation in PyTorch ☆863 · Jul 25, 2024 · Updated last year