code for Explicit Sparse Transformer
☆60 · Jul 21, 2023 · Updated 2 years ago
Alternatives and similar repositories for Explicit-Sparse-Transformer
Users that are interested in Explicit-Sparse-Transformer are comparing it to the libraries listed below.
- Code for the article "Automatic Temperature Control for Neural Machine Translation" (EMNLP 2018) ☆14 · Apr 16, 2019 · Updated 7 years ago
- Code associated with the paper **SkipBERT: Efficient Inference with Shallow Layer Skipping**, at ACL 2022 ☆16 · Jun 22, 2022 · Updated 4 years ago
- Tool for Evaluating Adversarial Perturbations on Text ☆61 · Feb 27, 2022 · Updated 4 years ago
- [Findings of EMNLP22] From Mimicking to Integrating: Knowledge Integration for Pre-Trained Language Models ☆19 · Mar 16, 2023 · Updated 3 years ago
- Code for ACL 2021 paper: Accelerating BERT Inference for Sequence Labeling via Early-Exit ☆28 · Aug 19, 2022 · Updated 3 years ago
- Code for reproducing key results in the paper "Neural Shuffle-Exchange Networks - Sequence Processing in O(n log n) Time" by Kārlis Freiv… ☆11 · Apr 10, 2020 · Updated 6 years ago
- Sparse Attention with Linear Units ☆20 · Apr 21, 2021 · Updated 5 years ago
- ☆13 · Nov 23, 2019 · Updated 6 years ago
- The entmax mapping and its loss, a family of sparse softmax alternatives. ☆475 · Jun 22, 2024 · Updated 2 years ago
- Implementation of RealFormer using pytorch ☆101 · Dec 27, 2020 · Updated 5 years ago
- Implement attention model to LSTM using TensorFlow ☆10 · Jul 3, 2018 · Updated 8 years ago
- Leveraging Local and Global Patterns for Self-Attention Networks ☆12 · Jun 3, 2019 · Updated 7 years ago
- The official repository for our paper "The Neural Data Router: Adaptive Control Flow in Transformers Improves Systematic Generalization". ☆34 · Jun 11, 2025 · Updated last year
- ☆23 · Mar 18, 2024 · Updated 2 years ago
- Official code repository of paper Equivariance via Minimal Frame Averaging for More Symmetries and Efficiency. ☆20 · Jan 18, 2025 · Updated last year
- Data and code for paper "Review-Driven Multi-Label Music Style Classification by Exploiting Style Correlations" ☆17 · Jun 30, 2019 · Updated 7 years ago
- The augmented data of the paper "Parallel Data Augmentation for Formality Style Transfer" (ACL 2020). ☆12 · May 14, 2020 · Updated 6 years ago
- Neural Text Generation with Unlikelihood Training ☆311 · Aug 31, 2021 · Updated 4 years ago
- Spectral Attention Autoregressive Model (SAAM) ☆17 · Oct 27, 2022 · Updated 3 years ago
- Subgraph-augmented Path Embedding for Semantic User Search on Heterogeneous Social Network ☆13 · Feb 19, 2018 · Updated 8 years ago
- ☆14 · Jan 5, 2022 · Updated 4 years ago
- Contrastive evaluation of pronoun translation in neural machine translation ☆26 · Aug 22, 2019 · Updated 6 years ago
- This repo contains all the codes for SEScore implementation ☆15 · Mar 3, 2025 · Updated last year
- Data and code used in our NAACL'19 paper "Selective Attention for Context-aware Neural Machine Translation" ☆30 · Apr 12, 2020 · Updated 6 years ago
- ☆11 · Dec 27, 2022 · Updated 3 years ago
- The accompanying code for "Memory-efficient Transformers via Top-k Attention" (Ankit Gupta, Guy Dar, Shaya Goodman, David Ciprut, Jonatha… ☆70 · Sep 19, 2021 · Updated 4 years ago
- Unofficial reimplementation of Dynamic Fusion with Intra- and Inter-modality Attention Flow for Visual Question Answering ☆17 · Oct 30, 2019 · Updated 6 years ago
- Official implementation of paper "Vision Graph Prompting via Semantic Low-Rank Decomposition", ICML 2025 ☆16 · Dec 25, 2025 · Updated 6 months ago
- FLASHQuad_pytorch ☆68 · Apr 1, 2022 · Updated 4 years ago
- Code for "Understanding and Improving Layer Normalization" ☆46 · Dec 8, 2019 · Updated 6 years ago
- Unpaired Image Captioning ☆36 · Mar 25, 2021 · Updated 5 years ago
- Official implementation of paper "GAPrompt: Geometry-Aware Point Cloud Prompt for 3D Vision Model", ICML 2025 ☆17 · Dec 25, 2025 · Updated 6 months ago
- Jump to better conclusions: SCAN both left and right ☆11 · Jan 24, 2019 · Updated 7 years ago
- A pytorch implementation of our paper Image Captioning with Inherent Sentiment (ICME 2021 Oral). ☆11 · Jul 18, 2022 · Updated 3 years ago
- Examples of using sparse attention, as in "Generating Long Sequences with Sparse Transformers" ☆1,615 · Aug 12, 2020 · Updated 5 years ago
- Understanding the Difficulty of Training Transformers ☆332 · May 31, 2022 · Updated 4 years ago
- Implementation for paper "A Knowledge-Enhanced Pretraining Model for Commonsense Story Generation" ☆24 · Mar 1, 2020 · Updated 6 years ago
- ☆36 · Oct 3, 2018 · Updated 7 years ago
- Code for "Aligning Visual Regions and Textual Concepts for Semantic-Grounded Image Representations" (NeurIPS 2019) ☆65 · Oct 19, 2020 · Updated 5 years ago