Implementing SYNTHESIZER: Rethinking Self-Attention in Transformer Models using Pytorch
☆71May 28, 2020Updated 6 years ago
Alternatives and similar repositories for Synthesizer-Rethinking-Self-Attention-Transformer-Models
Users that are interested in Synthesizer-Rethinking-Self-Attention-Transformer-Models are comparing it to the libraries listed below.
- A PyTorch implementation of the paper - "Synthesizer: Rethinking Self-Attention in Transformer Models"☆75Dec 8, 2022Updated 3 years ago
- Source code for "Retrieving Sequential Information for Non-Autoregressive Neural Machine Translation"☆18Aug 31, 2019Updated 6 years ago
- Implementing Randomly Wired Neural Networks for Image Recognition, Using CIFAR-10 dataset, CIFAR-100 dataset☆88May 26, 2019Updated 7 years ago
- A variant of Transformer-XL where the memory is updated not with a queue, but with attention☆49Jul 31, 2020Updated 5 years ago
- KANs and MLPs☆12Jun 7, 2024Updated 2 years ago
- Locally Enhanced Self-Attention: Rethinking Self-Attention as Local and Context Terms☆20Nov 29, 2021Updated 4 years ago
- tensorflow implementation of Exploring Randomly Wired Neural Networks for Image Recognition☆32Dec 16, 2019Updated 6 years ago
- Multimodal classification solution for the SIGIR eCOM using Co-attention and transformer language models☆19Aug 17, 2020Updated 5 years ago
- ☆25Jun 24, 2021Updated 4 years ago
- [EVA ICLR'23; LARA ICML'22] Efficient attention mechanisms via control variates, random features, and importance sampling☆86Mar 7, 2023Updated 3 years ago
- Implementing MixNet: Mixed Depthwise Convolutional Kernels using Pytorch☆62Jun 11, 2020Updated 6 years ago
- Unsupervised Key-phrase Extraction and Clustering for Classification Scheme in Scientific Publications.☆19May 24, 2021Updated 5 years ago
- CLIP: Connecting Text and Image (Learning Transferable Visual Models From Natural Language Supervision)☆85Jan 19, 2021Updated 5 years ago
- Implementation of the paper "Autoformer: Decomposition Transformers with Auto-Correlation for Long-Term Series Forecasting", https://arxi…☆19Jul 20, 2021Updated 4 years ago
- ☆25Jun 23, 2020Updated 5 years ago
- Code for our SIGIR 2021 short paper "Lighter and Better: Low-Rank Decomposed Self-Attention Networks for Next-Item Recommendation."☆15May 5, 2021Updated 5 years ago
- ☆98Apr 27, 2022Updated 4 years ago
- Learn to Resolve Conversational Dependency: A Consistency Training Framework for Conversational Question Answering (Kim et al., ACL 2021)☆32Jan 2, 2023Updated 3 years ago
- ☆19Mar 17, 2021Updated 5 years ago
- A set of fundamental operations and deep learning models using JAX☆15Mar 12, 2021Updated 5 years ago
- ☆10Nov 15, 2020Updated 5 years ago
- ☆18Aug 9, 2018Updated 7 years ago
- Benchmarking various sparse convolution libraries: MinkowskiEngine, SpConv, TorchSparse, and Open3D.☆13Apr 10, 2023Updated 3 years ago
- Multi-Head Attention, Transformer, Perceiver, Linear Attention.☆12Oct 24, 2023Updated 2 years ago
- pytorch implementation of XMC-GAN☆11Jun 2, 2021Updated 5 years ago
- Method to improve inference time for BERT. This is an implementation of the paper titled "PoWER-BERT: Accelerating BERT Inference via Pro…☆62Sep 17, 2025Updated 9 months ago
- (ICCV 2021) BossNAS: Exploring Hybrid CNN-transformers with Block-wisely Self-supervised Neural Architecture Search☆143Dec 6, 2021Updated 4 years ago
- CP-GAN: Class-Distinct and Class-Mutual Image Generation with GANs☆15Jun 19, 2021Updated 4 years ago
- How to design a MIPI CSI interface with Efinix Trion FPGA T20F169 QUICKLY☆10Feb 6, 2020Updated 6 years ago
- Bottleneck Transformers for Visual Recognition☆279Mar 14, 2021Updated 5 years ago
- Source code for <Sequence-Level Training for Non-Autoregressive Neural Machine Translation>.☆24Jan 17, 2022Updated 4 years ago
- [FPGA'21] CoDeNet is an efficient object detection model on PyTorch, with SOTA performance on VOC and COCO based on CenterNet and Co-Desi…☆28Feb 7, 2023Updated 3 years ago
- DuReader bert Chinese MRC☆14Nov 18, 2022Updated 3 years ago
- [ACL'20] Highway Transformer: A Gated Transformer.☆33Dec 5, 2021Updated 4 years ago
- This is the repository for our WSDM 2020 publication: Interpretable Click-through Rate Prediction through Hierarchical Attention☆40Oct 29, 2019Updated 6 years ago
- The implementation of "Does Multi-Encoder Help? A Case Study on Context-Aware Neural Machine Translation"☆39Aug 26, 2020Updated 5 years ago
- [KDD'22] Learned Token Pruning for Transformers☆98Feb 27, 2023Updated 3 years ago
- Open Source + Multilingual MLLM + Fine-tuning + Distillation + More efficient models and learning + ?☆18Jan 31, 2025Updated last year
- Transformer-based approaches for an efficient docstrings generation on a piece of Python's code.☆17Feb 16, 2026Updated 4 months ago