My implementation of the original transformer model (Vaswani et al.). I've also included the playground.py file for visualizing concepts that can otherwise seem hard. IWSLT pretrained models are currently included.
☆44Dec 12, 2024Updated last year
Alternatives and similar repositories for Rethinking-attention
Users who are interested in Rethinking-attention are comparing it to the libraries listed below.
- Image to LaTeX pytorch model☆14Jul 6, 2023Updated 2 years ago
- [ICCV 23] MolGrapher: Graph-based Visual Recognition of Chemical Structures☆16Oct 27, 2025Updated 5 months ago
- ☆21Feb 23, 2023Updated 3 years ago
- Cheminformatic analysis of small molecule type drugs in DrugBank for their ability to form nanoparticles with indocyanine dyes.☆11Apr 30, 2018Updated 7 years ago
- Source code for the paper DGNN-DDI☆11Oct 8, 2023Updated 2 years ago
- Learning Safety Constraints for Large Language Models (ICML2025)☆34Aug 4, 2025Updated 8 months ago
- A counterfactual collaborative session-based recommender system. WWW'23.☆10Nov 10, 2023Updated 2 years ago
- [WWW '24] UnifiedSSR: A Unified Framework of Sequential Search and Recommendation☆12Feb 16, 2024Updated 2 years ago
- ☆11Mar 8, 2024Updated 2 years ago
- [AAAI 24] Transformer-based relation-aware graph representation learning framework for DDI prediction☆16Jan 25, 2024Updated 2 years ago
- Fourier Spatial-Temporal Network for Multivariate Time Series Forecasting☆11Jan 1, 2023Updated 3 years ago
- ICPR2022: Dynamic Data Augmentation with Gating Networks for Time Series Recognition☆11Jul 28, 2022Updated 3 years ago
- The code for WWW2024 paper "Rethinking Cross-Domain Sequential Recommendation under Open-World Assumptions".☆36Aug 12, 2024Updated last year
- ☆13Mar 28, 2024Updated 2 years ago
- Behavior-Contextualized Item Preference Network for Multi-Behavior Recommendation☆16Nov 8, 2024Updated last year
- [SIGIR 2024] NFARec: A Negative Feedback-Aware Recommender Model.☆12Jan 9, 2025Updated last year
- [IJCV 2023] FlowNAS: Neural Architecture Search for Optical Flow Estimation☆16Feb 21, 2024Updated 2 years ago
- ☆14Jun 22, 2022Updated 3 years ago
- GCN time series prediction using taxi trajectory data from Shenzhen☆11Nov 9, 2023Updated 2 years ago
- ☆17Aug 31, 2022Updated 3 years ago
- The repo for reproducing the main results in TSMixer: An all-MLP Architecture for Time Series Forecasting.☆10Jun 15, 2023Updated 2 years ago
- An implementation of LazyLLM token pruning for LLaMa 2 model family.☆13Jan 6, 2025Updated last year
- This is the code for our work "Are More Layers Beneficial to Graph Transformers?", published at ICLR 2023.☆21May 27, 2023Updated 2 years ago
- Comparing retrieval abilities from GPT4-Turbo and a RAG system on a toy example for various context lengths☆35Dec 1, 2023Updated 2 years ago
- ☆18Jul 6, 2024Updated last year
- CNN+KAN architecture on MNIST (Val Acc: 96%)☆13May 4, 2024Updated last year
- Improving Recommendation Fairness via Data Augmentation-WWW23☆15Jun 6, 2023Updated 2 years ago
- Given an image of a molecule, create a SMILES or mol representation.☆25May 28, 2021Updated 4 years ago
- ☆16Nov 16, 2025Updated 5 months ago
- Unofficial implementation of the paper "Exploring the Space of Key-Value-Query Models with Intention"☆12May 24, 2023Updated 2 years ago
- Time Series Representation Models☆13Jul 17, 2025Updated 8 months ago
- This repository is an official PyTorch implementation of our paper "Feature Distillation Interaction Weighting Network for Lightweight Im…☆13May 6, 2023Updated 2 years ago
- repository for "Adaptive Disentangled Transformer for Sequential Recommendation"☆15Jun 6, 2023Updated 2 years ago
- GAM-Depth: Self-Supervised Indoor Depth Estimation Leveraging a Gradient-Aware Mask and Semantic Constraints☆12Apr 15, 2024Updated 2 years ago
- Unsupervised learning on the Moving MNIST dataset. This repository contains an implementation of the ConvLSTM and PredRNN++ models in PyTorch.☆14May 20, 2022Updated 3 years ago
- ☆13May 12, 2025Updated 11 months ago
- Open source codes of "Spatio-Temporal Probabilistic Forecasting of Photovoltaic Power Based on Monotone Broad Learning System and Copula …☆15Feb 14, 2022Updated 4 years ago
- aigc evals☆10Dec 2, 2023Updated 2 years ago
- The code of AAAI'24 paper GLRec: Exploring Large Language Model for Graph Data Understanding in Online Job Recommendations☆38Dec 21, 2023Updated 2 years ago