Fast Block Sparse Matrices for Pytorch
☆551Jan 21, 2021Updated 5 years ago
Alternatives and similar repositories for pytorch_block_sparse
Users who are interested in pytorch_block_sparse are comparing it to the libraries listed below.
- Block-sparse primitives for PyTorch☆158Apr 5, 2021Updated 5 years ago
- Efficient GPU kernels for block-sparse matrix multiplication and convolution☆1,066Jun 8, 2023Updated 3 years ago
- CUDA templates for tile-sparse matrix multiplication based on CUTLASS.☆52Mar 1, 2018Updated 8 years ago
- ☆220Jun 8, 2020Updated 6 years ago
- Pytorch library for fast transformer implementations☆1,771Mar 23, 2023Updated 3 years ago
- Transformer training code for sequential tasks☆609Sep 14, 2021Updated 4 years ago
- PyTorch extensions for high performance and large scale training.☆3,407Apr 26, 2025Updated last year
- Fast, general, and tested differentiable structured prediction in PyTorch☆1,130Apr 20, 2022Updated 4 years ago
- SentAugment is a data augmentation technique for NLP that retrieves similar sentences from a large bank of sentences. It can be used in c…☆358Feb 22, 2022Updated 4 years ago
- FastFormers - highly efficient transformer models for NLU☆707Mar 21, 2025Updated last year
- Longformer: The Long-Document Transformer☆2,196Feb 8, 2023Updated 3 years ago
- A library of GPU kernels for sparse matrix operations.☆288Nov 24, 2020Updated 5 years ago
- higher is a pytorch library allowing users to obtain higher order gradients over losses spanning training loops rather than individual tr…☆1,627Mar 25, 2022Updated 4 years ago
- Long Range Arena for Benchmarking Efficient Transformers☆788Dec 16, 2023Updated 2 years ago
- ☆21Mar 15, 2023Updated 3 years ago
- PyTorch Extension Library of Optimized Autograd Sparse Matrix Operations☆1,099Jun 3, 2026Updated last week
- Examples of using sparse attention, as in "Generating Long Sequences with Sparse Transformers"☆1,613Aug 12, 2020Updated 5 years ago
- Reformer, the efficient Transformer, in Pytorch☆2,192Jun 21, 2023Updated 2 years ago
- An efficient implementation of the popular sequence models for text generation, summarization, and translation tasks. https://arxiv.org/p…☆435Aug 17, 2022Updated 3 years ago
- A python library for highly configurable transformers - easing model architecture search and experimentation.☆48Nov 30, 2021Updated 4 years ago
- A PyTorch Extension: Tools for easy mixed precision and distributed training in Pytorch☆8,972Jun 1, 2026Updated last week
- End-to-end training of sparse deep neural networks with little-to-no performance loss.☆337Jan 26, 2023Updated 3 years ago
- Flexible and powerful tensor operations for readable and reliable code (for pytorch, jax, TF and others)☆9,507May 31, 2026Updated last week
- Single Headed Attention RNN - "Stop thinking with your head"☆1,181Nov 27, 2021Updated 4 years ago
- Type annotations and dynamic checking for a tensor's shape, dtype, names, etc.☆1,481May 2, 2025Updated last year
- Shape and dimension inference (Keras-like) for PyTorch layers and neural networks☆570Jun 13, 2022Updated 3 years ago
- Papers & presentation materials from Hugging Face's internal science day☆2,051Oct 31, 2020Updated 5 years ago
- Repository for the paper "Optimal Subarchitecture Extraction for BERT"☆471Jun 22, 2022Updated 3 years ago
- Fully featured implementation of Routing Transformer☆301Nov 6, 2021Updated 4 years ago
- DeLighT: Very Deep and Light-Weight Transformers☆469Oct 16, 2020Updated 5 years ago
- ☆774Jan 27, 2024Updated 2 years ago
- Prune a model while finetuning or training.☆407Jun 21, 2022Updated 3 years ago
- On the Variance of the Adaptive Learning Rate and Beyond☆2,550Jul 31, 2021Updated 4 years ago
- Efficient, check-pointed data loading for deep learning with massive data sets.☆211Jun 12, 2023Updated 2 years ago
- Implementation of Token Shift GPT - An autoregressive model that solely relies on shifting the sequence space for mixing☆49Jan 27, 2022Updated 4 years ago
- 🚪✊Knock Knock: Get notified when your training ends with only two additional lines of code☆2,825Jun 23, 2023Updated 2 years ago
- [ICLR 2020] Lite Transformer with Long-Short Range Attention☆611Jul 11, 2024Updated last year
- KErnel OPerationS, on CPUs and GPUs, with autodiff and without memory overflows☆1,176Updated this week
- Cascaded Text Generation with Markov Transformers☆130Mar 20, 2023Updated 3 years ago