Codes for "Understanding and Improving Transformer From a Multi-Particle Dynamic System Point of View"
☆147 · Jun 10, 2019 · Updated 6 years ago
Alternatives and similar repositories for macaron-net
Users that are interested in macaron-net are comparing it to the libraries listed below.
- Source code for "Efficient Training of BERT by Progressively Stacking" ☆112 · Jul 3, 2019 · Updated 6 years ago
- Source code for "A Lightweight Recurrent Network for Sequence Modeling" ☆26 · Dec 7, 2022 · Updated 3 years ago
- Facebook AI Research Sequence-to-Sequence Toolkit written in Python. ☆22 · Jan 25, 2023 · Updated 3 years ago
- [ICML 2020] code for "PowerNorm: Rethinking Batch Normalization in Transformers" https://arxiv.org/abs/2003.07845 ☆120 · Jun 20, 2021 · Updated 4 years ago
- ☆20 · Feb 26, 2021 · Updated 5 years ago
- Code for the paper "Ordered Neurons: Integrating Tree Structures into Recurrent Neural Networks" ☆580 · Aug 28, 2019 · Updated 6 years ago
- Paper List For Linking ODE and Deep Learning ☆245 · Feb 18, 2020 · Updated 6 years ago
- A tensorflow implementation of the NIPS 2018 paper "Variational Inference with Tail-adaptive f-Divergence" ☆20 · Jan 11, 2019 · Updated 7 years ago
- code for Explicit Sparse Transformer ☆61 · Jul 21, 2023 · Updated 2 years ago
- Code for the paper Multimodal Transformer Networks for End-to-End Video-Grounded Dialogue Systems (ACL19) ☆100 · Oct 17, 2022 · Updated 3 years ago
- Repository for ACL 2019 paper ☆74 · Jun 30, 2019 · Updated 6 years ago
- The implementation of "Learning Deep Transformer Models for Machine Translation" ☆116 · Jul 25, 2024 · Updated last year
- Trust Region Preference Approximation: A simple and stable reinforcement learning algorithm for LLM reasoning ☆15 · Jun 28, 2025 · Updated 8 months ago
- Some good(maybe) papers about NMT (Neural Machine Translation). ☆85 · Jan 15, 2020 · Updated 6 years ago
- Latent Alignment and Variational Attention ☆329 · Nov 5, 2018 · Updated 7 years ago
- Source code of paper "BP-Transformer: Modelling Long-Range Context via Binary Partitioning" ☆127 · Apr 5, 2021 · Updated 4 years ago
- Tailoring Molecules for Protein Pockets: a Transformer-based Generative Solution for Structured-based Drug Design ☆20 · Jul 26, 2023 · Updated 2 years ago
- Transformer training code for sequential tasks ☆609 · Sep 14, 2021 · Updated 4 years ago
- ☆14 · May 7, 2019 · Updated 6 years ago
- Tensorflow Source code for "Recurrently Controlled Recurrent Networks" (NIPS 2018) ☆23 · Oct 25, 2018 · Updated 7 years ago
- ☆10 · Feb 12, 2020 · Updated 6 years ago
- Code for the article "Automatic Temperature Control for Neural Machine Translation" (EMNLP 2018) ☆14 · Apr 16, 2019 · Updated 6 years ago
- ☆10 · Jun 14, 2023 · Updated 2 years ago
- Code for ACL2020 "Jointly Masked Sequence-to-Sequence Model for Non-Autoregressive Neural Machine Translation" ☆39 · Jun 24, 2020 · Updated 5 years ago
- Code for "Theoretical Foundations of Deep Selective State-Space Models" (NeurIPS 2024) ☆15 · Jan 7, 2025 · Updated last year
- ☆14 · Nov 16, 2022 · Updated 3 years ago
- Understanding the Difficulty of Training Transformers ☆332 · May 31, 2022 · Updated 3 years ago
- ICLR2020 Downloader & Search Tool ☆18 · Oct 8, 2019 · Updated 6 years ago
- Learnable Embedding Space for Efficient Neural Architecture Compression ☆29 · Apr 25, 2019 · Updated 6 years ago
- Extend bert-nmt to context-aware translation. ☆11 · May 24, 2021 · Updated 4 years ago
- Experiments with Neural ODEs and Adversarial Attacks ☆44 · Jan 13, 2019 · Updated 7 years ago
- Implementation of the Optimal Completion Distillation for Sequence Labeling ☆17 · Jul 25, 2024 · Updated last year
- Code for NIPS 2018 paper 'Frequency-Agnostic Word Representation' ☆115 · May 2, 2019 · Updated 6 years ago
- Pytorch library for fast transformer implementations ☆1,765 · Mar 23, 2023 · Updated 3 years ago
- code for paper "Improving Sequence-to-Sequence Learning via Optimal Transport" ☆68 · Jun 24, 2019 · Updated 6 years ago
- Solution of KDD cup 2021 ☆11 · Jun 16, 2021 · Updated 4 years ago
- Experiments from the paper "On Second Order Behaviour in Augmented Neural ODEs" ☆61 · Sep 30, 2024 · Updated last year
- Codes for <Kernelized Bayesian Softmax for Text Generation> in NeurIPS 2019 ☆16 · Nov 20, 2019 · Updated 6 years ago
- (Batched) advanced indexing for PyTorch. ☆53 · Dec 26, 2024 · Updated last year