A PyTorch implementation of the paper - "Synthesizer: Rethinking Self-Attention in Transformer Models"
☆74 · Dec 8, 2022 · Updated 3 years ago
Alternatives and similar repositories for Synthesizer
Users that are interested in Synthesizer are comparing it to the libraries listed below.
- Implementing SYNTHESIZER: Rethinking Self-Attention in Transformer Models using Pytorch ☆71 · May 28, 2020 · Updated 5 years ago
- https://challenge.enliple.com/ ☆16 · Jun 10, 2020 · Updated 5 years ago
- [Findings of NAACL2022] A Dog Is Passing Over The Jet? A Text-Generation Dataset for Korean Commonsense Reasoning and Evaluation ☆11 · May 27, 2022 · Updated 3 years ago
- ☆10 · Apr 2, 2022 · Updated 4 years ago
- Code for ACL 2022 paper "HIBRIDS: Attention with Hierarchical Biases for Structure-aware Long Document Summarization". ☆13 · May 24, 2022 · Updated 3 years ago
- Fair Embedding Engine ☆14 · Oct 25, 2020 · Updated 5 years ago
- Implementation of RealFormer using pytorch ☆101 · Dec 27, 2020 · Updated 5 years ago
- Convolutional Fine-Grained Classification with Self-Supervised Target Relation Regularization (TIP 2022) ☆12 · Sep 8, 2022 · Updated 3 years ago
- Code for the TACL paper "Nurse is Closer to Woman than Surgeon? Mitigating Gender-Biased Proximities in Word Embeddings" ☆16 · Sep 8, 2020 · Updated 5 years ago
- Visual Transformers with Primal Object Queries for Multi-Label Image Classification ☆12 · May 17, 2022 · Updated 3 years ago
- M3TR: Multi-modal Multi-label Recognition with Transformer. ACM MM 2021 ☆16 · Oct 27, 2021 · Updated 4 years ago
- KoBART chatbot ☆45 · Jun 22, 2021 · Updated 4 years ago
- The accompanying code for "Memory-efficient Transformers via Top-k Attention" (Ankit Gupta, Guy Dar, Shaya Goodman, David Ciprut, Jonatha… ☆69 · Sep 19, 2021 · Updated 4 years ago
- Convenient Text-to-Text Training for Transformers ☆19 · Dec 10, 2021 · Updated 4 years ago
- Code for Multi-Head Attention: Collaborate Instead of Concatenate ☆152 · Jun 12, 2023 · Updated 2 years ago
- ☆17 · Oct 19, 2021 · Updated 4 years ago
- Notes on document summarization papers ☆15 · Oct 27, 2021 · Updated 4 years ago
- Open Source + Multilingual MLLM + Fine-tuning + Distillation + More efficient models and learning + ? ☆18 · Jan 31, 2025 · Updated last year
- ☆18 · Apr 11, 2021 · Updated 5 years ago
- LM pretraining for generation, reading list, resources, conference mappings. ☆20 · Feb 25, 2020 · Updated 6 years ago
- ☆25 · Jul 15, 2023 · Updated 2 years ago
- annotated-transformer-kr ☆15 · May 16, 2019 · Updated 6 years ago
- Deploy Pytorch models to production via panini ☆10 · Mar 18, 2019 · Updated 7 years ago
- DeLighT: Very Deep and Light-Weight Transformers ☆469 · Oct 16, 2020 · Updated 5 years ago
- BERT baselines for extractive question answering on coqa (https://stanfordnlp.github.io/coqa/) ☆10 · Jan 27, 2020 · Updated 6 years ago
- Style Transfer by Rigid Alignment in Neural Net Feature Space ☆11 · Jan 23, 2021 · Updated 5 years ago
- Poly-encoder architecture and pre-training pipeline implementation (pytorch) ☆16 · Jun 29, 2020 · Updated 5 years ago
- A Pytorch implementation of Attention on Attention module (both self and guided variants), for Visual Question Answering ☆43 · Nov 8, 2020 · Updated 5 years ago
- NeurIPS 2019 Paper Implementation ☆12 · Nov 22, 2022 · Updated 3 years ago
- [TIP] Learning to Discover Multi-Class Attentional Regions for Multi-Label Image Recognition ☆45 · Apr 12, 2023 · Updated 3 years ago
- Implementation of the GLOM model for text ☆11 · Mar 4, 2021 · Updated 5 years ago
- ☆20 · Nov 11, 2019 · Updated 6 years ago
- The code of ACL 2020 paper "You Impress Me: Dialogue Generation via Mutual Persona Perception" ☆307 · Oct 27, 2023 · Updated 2 years ago
- Code and data recipes for the paper: Heterogeneous Target Speech Separation ☆43 · Dec 6, 2022 · Updated 3 years ago
- Train your own GPT2! ☆14 · Apr 11, 2023 · Updated 3 years ago
- Chapter 9: Attention and Memory Augmented Networks ☆13 · Jul 23, 2019 · Updated 6 years ago
- ☆24 · Nov 22, 2022 · Updated 3 years ago
- Question Answering In Context ☆28 · Nov 24, 2022 · Updated 3 years ago
- ☆13 · Jul 13, 2022 · Updated 3 years ago