evelinehong / Transformer_Relative_Position_PyTorch

Implementation of the paper "Self-Attention with Relative Position Representations"

☆135

Alternatives and similar repositories for Transformer_Relative_Position_PyTorch

Users who are interested in Transformer_Relative_Position_PyTorch are comparing it to the libraries listed below.

lucidrains / FLASH-pytorch
Implementation of the Transformer variant proposed in "Transformer Quality in Linear Time"
☆368 Updated last year
CyberZHG / torch-multi-head-attention
Multi-head attention in PyTorch
☆153 Updated 6 years ago
fangleai / TransformerCVAE
Transformer-based Conditional Variational Autoencoder for Controllable Story Generation
☆155 Updated 3 years ago
Hzfinfdu / Diffusion-BERT
ACL'2023: DiffusionBERT: Improving Generative Masked Language Models with Diffusion Models
☆319 Updated last year
OpenNLPLab / cosFormer
[ICLR 2022] Official implementation of cosformer-attention in cosFormer: Rethinking Softmax in Attention
☆196 Updated 2 years ago
wuch15 / Fastformer
A PyTorch & Keras implementation and demo of Fastformer.
☆189 Updated 2 years ago
dropreg / R-Drop
☆882 Updated last year
tatp22 / linformer-pytorch
My take on a practical implementation of Linformer for Pytorch.
☆417 Updated 3 years ago
lucidrains / local-attention
An implementation of local windowed attention for language modeling
☆470 Updated 3 weeks ago
twistedcubic / attention-rank-collapse
[ICML 2021 Oral] We show pure attention suffers rank collapse, and how different mechanisms combat it.
☆165 Updated 4 years ago
xuanqing94 / FLOATER
Learning to Encode Position for Transformer with Continuous Dynamical Model
☆60 Updated 5 years ago
LitterBrother-Xiao / Overview-of-Non-autoregressive-Applications
☆181 Updated last year
OpenVLG / DELLA
Official code for the NAACL 2022 paper "Fuse It More Deeply! A Variational Transformer with Layer-Wise Latent Variable Inference for Text…
☆35 Updated 2 years ago
thu-coai / DA-Transformer
Official Implementation for the ICML2022 paper "Directed Acyclic Transformer for Non-Autoregressive Machine Translation"
☆127 Updated last year
RElbers / info-nce-pytorch
PyTorch implementation of the InfoNCE loss for self-supervised learning.
☆574 Updated last year
dashstander / block-recurrent-transformer
Pytorch implementation of "Block Recurrent Transformers" (Hutchins & Schlag et al., 2022)
☆84 Updated 3 years ago
guolinke / TUPE
Transformer with Untied Positional Encoding (TUPE). Code of paper "Rethinking Positional Encoding in Language Pre-training". Improve exis…
☆251 Updated 3 years ago
guocheng2025 / Transformer-Encoder
Implementation of Transformer encoder in PyTorch
☆67 Updated 4 years ago
NVIDIA / transformer-ls
Official PyTorch Implementation of Long-Short Transformer (NeurIPS 2021).
☆225 Updated 3 years ago
budzianowski / PyTorch-Beam-Search-Decoding
PyTorch implementation of beam search decoding for seq2seq models
☆337 Updated 2 years ago
IndexFziQ / Diffusion4NLP-Papers
A paper list about diffusion models for natural language processing.
☆183 Updated last year
thuml / LogME
Code release for "LogME: Practical Assessment of Pre-trained Models for Transfer Learning" (ICML 2021) and Ranking and Tuning Pre-trained…
☆208 Updated last year
huanghonggit / Mask-Language-Model
PyTorch; mask language model; BERT
☆72 Updated 5 years ago
lucidrains / linformer
Implementation of Linformer for Pytorch
☆295 Updated last year
DRSY / EMO
[ICLR 2024] EMO: Earth Mover Distance Optimization for Auto-Regressive Language Modeling (https://arxiv.org/abs/2310.04691)
☆123 Updated last year
Shark-NLP / DiffuSeq
[ICLR'23] DiffuSeq: Sequence to Sequence Text Generation with Diffusion Models
☆801 Updated last year
rishikksh20 / FNet-pytorch
Unofficial implementation of Google's FNet: Mixing Tokens with Fourier Transforms
☆259 Updated 4 years ago
lucidrains / bidirectional-cross-attention
A simple cross attention that updates both the source and target in one step
☆172 Updated last week
jxhe / unify-parameter-efficient-tuning
Implementation of paper "Towards a Unified View of Parameter-Efficient Transfer Learning" (ICLR 2022)
☆535 Updated 3 years ago
lucidrains / routing-transformer
Fully featured implementation of Routing Transformer
☆297 Updated 3 years ago