xuanqing94 / FLOATER

Learning to Encode Position for Transformer with Continuous Dynamical Model

☆60

Alternatives and similar repositories for FLOATER

Users who are interested in FLOATER are comparing it to the repositories listed below

xwgeng / SSAN
How Does Selective Mechanism Improve Self-attention Networks?
☆28 Updated 4 years ago
lancopku / Explicit-Sparse-Transformer
code for Explicit Sparse Transformer
☆62 Updated last year
Noahs-ARK / RFA
☆33 Updated 4 years ago
libeineu / ODE-Transformer
This is a code repository for the ACL 2022 paper "ODE Transformer: An Ordinary Differential Equation-Inspired Model for Sequence Generati…
☆33 Updated 2 years ago
leaderj1001 / Synthesizer-Rethinking-Self-Attention-Transformer-Models
Implementing SYNTHESIZER: Rethinking Self-Attention in Transformer Models using Pytorch
☆70 Updated 5 years ago
yaohungt / TransformerDissection
[EMNLP'19] Summary for Transformer Understanding
☆53 Updated 5 years ago
lzy1732008 / GaussionTransformer
For paper《Gaussian Transformer: A Lightweight Approach for Natural Language Inference》
☆28 Updated 5 years ago
Junya-Chen / FlatCLR
FlatNCE: A Novel Contrastive Representation Learning Objective
☆90 Updated 3 years ago
rabeehk / vibert
Implementation for Variational Information Bottleneck for Effective Low-resource Fine-tuning, ICLR 2021
☆40 Updated 4 years ago
lioutasb / TaLKConvolutions
Official PyTorch implementation of Time-aware Large Kernel (TaLK) Convolutions (ICML 2020)
☆29 Updated 4 years ago
FreedomIntelligence / complex-order
☆83 Updated 5 years ago
CharizardAcademy / convtransformer
Code for the ACL2020 paper Character-Level Translation with Self-Attention
☆31 Updated 4 years ago
microsoft / EA-VQ-VAE
This repo provides the code for the ACL 2020 paper "Evidence-Aware Inferential Text Generation with Vector Quantised Variational AutoEnco…
☆55 Updated 4 years ago
microsoft / EfficientLongSequenceModeling
☆51 Updated 2 years ago
cshjin / GCL
List of Publications in Graph Contrastive Learning
☆35 Updated 3 years ago
sarthmit / Compositional-Attention
Code to reproduce the results for Compositional Attention
☆60 Updated 2 years ago
prajjwal1 / adaptive_transformer
Code for the paper "Adaptive Transformers for Learning Multimodal Representations" (ACL SRW 2020)
☆43 Updated 2 years ago
ShannonAI / GNN-LM
☆46 Updated 3 years ago
sh0416 / clrcmd
Official Repository for CLRCMD (Appear in ACL2022)
☆41 Updated 2 years ago
Victorwz / VaLM
VaLM: Visually-augmented Language Modeling. ICLR 2023.
☆56 Updated 2 years ago
LiqunChen0606 / OT-Seq2Seq
code for paper "Improving Sequence-to-Sequence Learning via Optimal Transport"
☆68 Updated 6 years ago
kugwzk / DiDE
Code for EMNLP 2022 paper “Distilled Dual-Encoder Model for Vision-Language Understanding”
☆30 Updated 2 years ago
fawazsammani / mogrifier-lstm-pytorch
Implementation of Mogrifier LSTM in PyTorch
☆35 Updated 5 years ago
zlinao / Variational-Transformer
Variational Transformers for Diverse Response Generation
☆81 Updated 11 months ago
mlpc-ucsd / BERT_Convolutions
(ACL-IJCNLP 2021) Convolutions and Self-Attention: Re-interpreting Relative Positions in Pre-trained Language Models.
☆21 Updated 2 years ago
ermongroup / subsets
Code for Reparameterizable Subset Sampling via Continuous Relaxations, IJCAI 2019.
☆57 Updated last year
evelinehong / Transformer_Relative_Position_PyTorch
Implement the paper "Self-Attention with Relative Position Representations"
☆133 Updated 4 years ago
yikangshen / MoA
Mixture of Attention Heads
☆47 Updated 2 years ago
CogComp / TAWT
Weighted Training for Cross-Task Learning
☆15 Updated 2 years ago
rosewang2008 / language_modeling_via_stochastic_processes
Language modeling via stochastic processes. Oral @ ICLR 2022.
☆138 Updated 2 years ago