pytorch / maskedtensorLinks

MaskedTensors for PyTorch

☆38

Alternatives and similar repositories for maskedtensor

Users that are interested in maskedtensor are comparing it to the libraries listed below

Sorting:

lucidrains / autoregressive-linear-attention-cuda
CUDA implementation of autoregressive linear attention, with all the latest research findings
☆45Updated 2 years ago
ahennequ / pytorch-custom-mma
☆29Updated 3 years ago
lucidrains / einops-exts
Implementation of some personal helper functions for Einops, my most favorite tensor manipulation library ❤️
☆55Updated 2 years ago
gisilvs / AEF
☆33Updated 2 years ago
lucidrains / gated-state-spaces-pytorch
Implementation of Gated State Spaces, from the paper "Long Range Language Modeling via Gated State Spaces", in Pytorch
☆101Updated 2 years ago
srush / tangent
Source-to-Source Debuggable Derivatives in Pure Python
☆15Updated last year
cpcp1998 / PermuteFormer
Code for the paper PermuteFormer
☆42Updated 4 years ago
ermongroup / fast_feedforward_computation
Official code for "Accelerating Feedforward Computation via Parallel Nonlinear Equation Solving", ICML 2021
☆28Updated 4 years ago
google-research / precondition
☆31Updated 4 months ago
peerdavid / layerwise-batch-entropy
Layerwise Batch Entropy Regularization
☆23Updated 3 years ago
sustcsonglin / gated_linear_attention_layer
☆31Updated last year
arogozhnikov / adamw_bfloat16
AdamW optimizer for bfloat16 models in pytorch 🔥.
☆37Updated last year
lucidrains / discrete-key-value-bottleneck-pytorch
Implementation of Discrete Key / Value Bottleneck, in Pytorch
☆88Updated 2 years ago
lucidrains / token-shift-gpt
Implementation of Token Shift GPT - An autoregressive model that solely relies on shifting the sequence space for mixing
☆50Updated 3 years ago
AranKomat / Metroplex
☆21Updated 2 years ago
Newbeeer / Anytime-Auto-Regressive-Model
Code for ICLR 2021 Paper, "Anytime Sampling for Autoregressive Models via Ordered Autoencoding"
☆26Updated 2 years ago
srush / mamba-scans
Blog post
☆17Updated last year
lucidrains / compositional-attention-pytorch
Implementation of "compositional attention" from MILA, a multi-head attention variant that is reframed as a two-step attention process wi…
☆51Updated 3 years ago
lucidrains / ponder-transformer
Implementation of a Transformer that Ponders, using the scheme from the PonderNet paper
☆81Updated 4 years ago
giannisdaras / smyrf
[NeurIPS 2020] Official Implementation: "SMYRF: Efficient Attention using Asymmetric Clustering".
☆50Updated 2 years ago
lucidrains / panoptic-transformer
Another attempt at a long-context / efficient transformer by me
☆38Updated 3 years ago
lucidrains / hourglass-transformer-pytorch
Implementation of Hourglass Transformer, in Pytorch, from Google and OpenAI
☆96Updated 3 years ago
gcambara / cape
Continuous Augmented Positional Embeddings (CAPE) implementation for PyTorch
☆42Updated 2 years ago
RobertCsordas / linear_layer_as_attention
The official repository for our paper "The Dual Form of Neural Networks Revisited: Connecting Test Time Predictions to Training Patterns …
☆16Updated 4 months ago
ColinQiyangLi / AdaCat
AdaCat
☆49Updated 3 years ago
thjashin / multires-conv
Sequence Modeling with Multiresolution Convolutional Memory (ICML 2023)
☆127Updated 2 years ago
lucidrains / rela-transformer
Implementation of a Transformer using ReLA (Rectified Linear Attention) from https://arxiv.org/abs/2104.07012
☆49Updated 3 years ago
EleutherAI / rnngineering
Engineering the state of RNN language models (Mamba, RWKV, etc.)
☆32Updated last year
HomebrewML / HomebrewNLP-torch
A case study of efficient training of large language models using commodity hardware.
☆68Updated 3 years ago
proger / nanokitchen
Parallel Associative Scan for Language Models
☆17Updated last year