Implementing SYNTHESIZER: Rethinking Self-Attention in Transformer Models using PyTorch
☆71 · May 28, 2020 · Updated 5 years ago
Alternatives and similar repositories for Synthesizer-Rethinking-Self-Attention-Transformer-Models
Users that are interested in Synthesizer-Rethinking-Self-Attention-Transformer-Models are comparing it to the libraries listed below.
- A PyTorch implementation of the paper "Synthesizer: Rethinking Self-Attention in Transformer Models" ☆75 · Dec 8, 2022 · Updated 3 years ago
- Bag of MLP ☆20 · May 31, 2021 · Updated 4 years ago
- Source code for "Retrieving Sequential Information for Non-Autoregressive Neural Machine Translation" ☆18 · Aug 31, 2019 · Updated 6 years ago
- Various test models in WNNX format. They can be viewed with `pip install wnetron && wnetron` ☆12 · Jun 22, 2022 · Updated 3 years ago
- Implementing Randomly Wired Neural Networks for Image Recognition, using the CIFAR-10 and CIFAR-100 datasets ☆89 · May 26, 2019 · Updated 7 years ago
- A variant of Transformer-XL where the memory is updated not with a queue, but with attention ☆49 · Jul 31, 2020 · Updated 5 years ago
- (ACL-IJCNLP 2021) Convolutions and Self-Attention: Re-interpreting Relative Positions in Pre-trained Language Models. ☆21 · Jul 13, 2022 · Updated 3 years ago
- ☆11 · Jul 5, 2020 · Updated 5 years ago
- Locally Enhanced Self-Attention: Rethinking Self-Attention as Local and Context Terms ☆20 · Nov 29, 2021 · Updated 4 years ago
- ☆11 · Oct 16, 2020 · Updated 5 years ago
- Multimodal classification solution for the SIGIR eCOM using Co-attention and transformer language models ☆19 · Aug 17, 2020 · Updated 5 years ago
- 🤗 An unofficial PyTorch implementation of ConvBert based on huggingface/transformers. ☆17 · Oct 6, 2022 · Updated 3 years ago
- ☆25 · Jun 24, 2021 · Updated 4 years ago
- Implementing MixNet: Mixed Depthwise Convolutional Kernels using PyTorch ☆62 · Jun 11, 2020 · Updated 5 years ago
- Unsupervised Key-phrase Extraction and Clustering for Classification Scheme in Scientific Publications. ☆19 · May 24, 2021 · Updated 5 years ago
- Implementation of the paper "Autoformer: Decomposition Transformers with Auto-Correlation for Long-Term Series Forecasting", https://arxi… ☆19 · Jul 20, 2021 · Updated 4 years ago
- Code for Dissecting Generation Modes for Abstractive Summarization Models via Ablation and Attribution (ACL2021) ☆13 · Jun 2, 2021 · Updated 4 years ago
- Code for our SIGIR 2021 short paper "Lighter and Better: Low-Rank Decomposed Self-Attention Networks for Next-Item Recommendation." ☆15 · May 5, 2021 · Updated 5 years ago
- ☆98 · Apr 27, 2022 · Updated 4 years ago
- Learn to Resolve Conversational Dependency: A Consistency Training Framework for Conversational Question Answering (Kim et al., ACL 2021) ☆32 · Jan 2, 2023 · Updated 3 years ago
- ☆27 · Jun 23, 2020 · Updated 5 years ago
- Implementation for the EMNLP 2021 paper "Interactive Machine Comprehension with Dynamic Knowledge Graphs". ☆21 · Aug 31, 2021 · Updated 4 years ago
- ☆10 · Nov 15, 2020 · Updated 5 years ago
- This is an official implementation of our CVPR 2020 paper "Non-Local Neural Networks With Grouped Bilinear Attentional Transforms". ☆13 · Jan 30, 2021 · Updated 5 years ago
- ☆25 · May 21, 2018 · Updated 8 years ago
- Method to improve inference time for BERT. This is an implementation of the paper titled "PoWER-BERT: Accelerating BERT Inference via Pro… ☆62 · Sep 17, 2025 · Updated 8 months ago
- (ICCV 2021) BossNAS: Exploring Hybrid CNN-transformers with Block-wisely Self-supervised Neural Architecture Search ☆143 · Dec 6, 2021 · Updated 4 years ago
- Code for DUMA: Reading Comprehension with Transposition Thinking ☆13 · Aug 10, 2022 · Updated 3 years ago
- Bottleneck Transformers for Visual Recognition ☆279 · Mar 14, 2021 · Updated 5 years ago
- Source code for <Sequence-Level Training for Non-Autoregressive Neural Machine Translation>. ☆24 · Jan 17, 2022 · Updated 4 years ago
- DuReader bert Chinese MRC ☆14 · Nov 18, 2022 · Updated 3 years ago
- A long version of BART model based on Longformer model ☆24 · Jun 12, 2023 · Updated 2 years ago
- [ACL'20] Highway Transformer: A Gated Transformer. ☆33 · Dec 5, 2021 · Updated 4 years ago
- This is the repository for our WSDM 2020 publication: Interpretable Click-through Rate Prediction through Hierarchical Attention ☆40 · Oct 29, 2019 · Updated 6 years ago
- The implementation of "Does Multi-Encoder Help? A Case Study on Context-Aware Neural Machine Translation" ☆39 · Aug 26, 2020 · Updated 5 years ago
- [KDD'22] Learned Token Pruning for Transformers ☆98 · Feb 27, 2023 · Updated 3 years ago
- ☆14 · Feb 7, 2020 · Updated 6 years ago
- ☆14 · Jul 27, 2022 · Updated 3 years ago
- An easy-to-use tool for phrase encoding and topic mining (unsupervised aspect extraction); Code base for ACL 2022 paper, UCTopic: Unsuper… ☆46 · Apr 25, 2023 · Updated 3 years ago