Codes for "Understanding and Improving Transformer From a Multi-Particle Dynamic System Point of View"
☆147 · Jun 10, 2019 · Updated 6 years ago
Alternatives and similar repositories for macaron-net
Users that are interested in macaron-net are comparing it to the libraries listed below.
- Source code for "Efficient Training of BERT by Progressively Stacking" ☆112 · Jul 3, 2019 · Updated 6 years ago
- Source code for "A Lightweight Recurrent Network for Sequence Modeling" ☆26 · Dec 7, 2022 · Updated 3 years ago
- ACL19_Depth_Growing_for_Neural_Machine_Translation ☆23 · Jul 6, 2019 · Updated 6 years ago
- Facebook AI Research Sequence-to-Sequence Toolkit written in Python. ☆22 · Jan 25, 2023 · Updated 3 years ago
- [ICML 2020] code for "PowerNorm: Rethinking Batch Normalization in Transformers" https://arxiv.org/abs/2003.07845 ☆120 · Jun 20, 2021 · Updated 4 years ago
- ☆20 · Feb 26, 2021 · Updated 5 years ago
- Codes for "Towards Binary-Valued Gates for Robust LSTM Training". ☆75 · Jul 22, 2018 · Updated 7 years ago
- Code for the paper "Ordered Neurons: Integrating Tree Structures into Recurrent Neural Networks" ☆579 · Aug 28, 2019 · Updated 6 years ago
- Paper List For Linking ODE and Deep Learning ☆244 · Feb 18, 2020 · Updated 6 years ago
- Code for the paper Multimodal Transformer Networks for End-to-End Video-Grounded Dialogue Systems (ACL19) ☆100 · Oct 17, 2022 · Updated 3 years ago
- code for Explicit Sparse Transformer ☆61 · Jul 21, 2023 · Updated 2 years ago
- Repository for ACL 2019 paper ☆75 · Jun 30, 2019 · Updated 6 years ago
- The implementation of "Learning Deep Transformer Models for Machine Translation" ☆116 · Jul 25, 2024 · Updated last year
- Some good(maybe) papers about NMT (Neural Machine Translation). ☆85 · Jan 15, 2020 · Updated 6 years ago
- Source code of paper "BP-Transformer: Modelling Long-Range Context via Binary Partitioning" ☆127 · Apr 5, 2021 · Updated 5 years ago
- Tailoring Molecules for Protein Pockets: a Transformer-based Generative Solution for Structured-based Drug Design ☆20 · Jul 26, 2023 · Updated 2 years ago
- ☆14 · May 7, 2019 · Updated 6 years ago
- Tensorflow Source code for "Recurrently Controlled Recurrent Networks" (NIPS 2018) ☆23 · Oct 25, 2018 · Updated 7 years ago
- Transformer training code for sequential tasks ☆609 · Sep 14, 2021 · Updated 4 years ago
- Code for the article "Automatic Temperature Control for Neural Machine Translation" (EMNLP 2018) ☆14 · Apr 16, 2019 · Updated 7 years ago
- ☆10 · Jun 14, 2023 · Updated 2 years ago
- Code for our nips19 paper: You Only Propagate Once: Accelerating Adversarial Training Via Maximal Principle ☆180 · Jul 25, 2024 · Updated last year
- Code for ACL2020 "Jointly Masked Sequence-to-Sequence Model for Non-Autoregressive Neural Machine Translation" ☆39 · Jun 24, 2020 · Updated 5 years ago
- Code for "Theoretical Foundations of Deep Selective State-Space Models" (NeurIPS 2024) ☆16 · Jan 7, 2025 · Updated last year
- ☆14 · Nov 16, 2022 · Updated 3 years ago
- Understanding the Difficulty of Training Transformers ☆332 · May 31, 2022 · Updated 3 years ago
- Learnable Embedding Space for Efficient Neural Architecture Compression ☆29 · Apr 25, 2019 · Updated 6 years ago
- Experiments with Neural ODEs and Adversarial Attacks ☆44 · Jan 13, 2019 · Updated 7 years ago
- Implementation of the Optimal Completion Distillation for Sequence Labeling ☆17 · Jul 25, 2024 · Updated last year
- Code for NIPS 2018 paper 'Frequency-Agnostic Word Representation' ☆115 · May 2, 2019 · Updated 6 years ago
- Pytorch library for fast transformer implementations ☆1,767 · Mar 23, 2023 · Updated 3 years ago
- code for paper "Improving Sequence-to-Sequence Learning via Optimal Transport" ☆68 · Jun 24, 2019 · Updated 6 years ago
- Solution of KDD cup 2021 ☆11 · Jun 16, 2021 · Updated 4 years ago
- Codes for <Kernelized Bayesian Softmax for Text Generation> in NeurIPS 2019 ☆16 · Nov 20, 2019 · Updated 6 years ago
- Experiments from the paper "On Second Order Behaviour in Augmented Neural ODEs" ☆61 · Sep 30, 2024 · Updated last year
- STCN: Stochastic Temporal Convolutional Networks ☆69 · Jul 15, 2020 · Updated 5 years ago
- (Batched) advanced indexing for PyTorch. ☆53 · Dec 26, 2024 · Updated last year
- Nonlinear SVGD for Learning Diversified Mixture Models ☆13 · Jan 23, 2019 · Updated 7 years ago
- Interpolation between Residual and Non-Residual Networks, ICML 2020. https://arxiv.org/abs/2006.05749 ☆26 · Aug 16, 2020 · Updated 5 years ago