zhuohan123/macaron-net

zhuohan123 / macaron-net

Codes for "Understanding and Improving Transformer From a Multi-Particle Dynamic System Point of View"

☆147

Alternatives and similar repositories for macaron-net

Users who are interested in macaron-net are comparing it to the repositories listed below.

gonglinyuan / StackingBERT
Source code for "Efficient Training of BERT by Progressively Stacking"
☆112 · Jul 3, 2019 · Updated 7 years ago
bzhangGo / lrn
Source code for "A Lightweight Recurrent Network for Sequence Modeling"
☆26 · Dec 7, 2022 · Updated 3 years ago
MultiPath / Squirrel
PyTorch implementation of Transformer-based Neural Machine Translation
☆77 · Dec 14, 2022 · Updated 3 years ago
apeterswu / Depth_Growing_NMT
ACL19_Depth_Growing_for_Neural_Machine_Translation
☆23 · Jul 6, 2019 · Updated 7 years ago
sIncerass / powernorm
[ICML 2020] code for "PowerNorm: Rethinking Batch Normalization in Transformers" https://arxiv.org/abs/2003.07845
☆120 · Jun 20, 2021 · Updated 5 years ago
pytorch-tpu / fairseq
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
☆22 · Jan 25, 2023 · Updated 3 years ago
zhuohan123 / g2-lstm
Codes for "Towards Binary-Valued Gates for Robust LSTM Training".
☆75 · Jul 22, 2018 · Updated 8 years ago
instance-wise-ordered-transformer / IOT
☆20 · Feb 26, 2021 · Updated 5 years ago
XueruiSu / Trust-Region-Preference-Approximation
Trust Region Preference Approximation: A simple and stable reinforcement learning algorithm for LLM reasoning
☆15 · Jun 28, 2025 · Updated last year
yikangshen / Ordered-Neurons
Code for the paper "Ordered Neurons: Integrating Tree Structures into Recurrent Neural Networks"
☆580 · Aug 28, 2019 · Updated 6 years ago
2prime / ODE-DL
Paper List For Linking ODE and Deep Learning
☆244 · Feb 18, 2020 · Updated 6 years ago
dilinwang820 / adaptive-f-divergence
A tensorflow implementation of the NIPS 2018 paper "Variational Inference with Tail-adaptive f-Divergence"
☆20 · Jan 11, 2019 · Updated 7 years ago
lancopku / Explicit-Sparse-Transformer
code for Explicit Sparse Transformer
☆61 · Jul 21, 2023 · Updated 3 years ago
henryhungle / MTN
Code for the paper Multimodal Transformer Networks for End-to-End Video-Grounded Dialogue Systems (ACL19)
☆100 · Oct 17, 2022 · Updated 3 years ago
vanzytay / QuaternionTransformers
Repository for ACL 2019 paper
☆74 · Jun 30, 2019 · Updated 7 years ago
wangqiangneu / dlcl
The implementation of "Learning Deep Transformer Models for Machine Translation"
☆116 · Jul 25, 2024 · Updated 2 years ago
harvardnlp / var-attn
Latent Alignment and Variational Attention
☆327 · Nov 5, 2018 · Updated 7 years ago
yzh119 / BPT
Source code of paper "BP-Transformer: Modelling Long-Range Context via Binary Partitioning"
☆127 · Apr 5, 2021 · Updated 5 years ago
HankerWu / TamGent
Tailoring Molecules for Protein Pockets: a Transformer-based Generative Solution for Structured-based Drug Design
☆20 · Jul 26, 2023 · Updated 3 years ago
SITE5039 / AdaMixUp
☆14 · May 7, 2019 · Updated 7 years ago
facebookresearch / adaptive-span
Transformer training code for sequential tasks
☆610 · Sep 14, 2021 · Updated 4 years ago
vanzytay / NIPS2018_RCRN
Tensorflow Source code for "Recurrently Controlled Recurrent Networks" (NIPS 2018)
☆23 · Oct 25, 2018 · Updated 7 years ago
zhuohan123 / hint-nart
☆10 · Feb 12, 2020 · Updated 6 years ago
lancopku / SACT
Code for the article "Automatic Temperature Control for Neural Machine Translation" (EMNLP 2018)
☆14 · Apr 16, 2019 · Updated 7 years ago
a1600012888 / YOPO-You-Only-Propagate-Once
Code for our nips19 paper: You Only Propagate Once: Accelerating Adversarial Training Via Maximal Principle
☆180 · Jul 25, 2024 · Updated 2 years ago
LiyuanLucasLiu / Transformer-Clinic
Understanding the Difficulty of Training Transformers
☆332 · May 31, 2022 · Updated 4 years ago
lemmonation / jm-nat
Code for ACL2020 "Jointly Masked Sequence-to-Sequence Model for Non-Autoregressive Neural Machine Translation"
☆39 · Jun 24, 2020 · Updated 6 years ago
ustctf-zz / delibnet
☆14 · Nov 16, 2022 · Updated 3 years ago
AminJun / ICLR2020
ICLR2020 Downloader & Search Tool
☆18 · Oct 8, 2019 · Updated 6 years ago
Shengcao-Cao / ESNAC
Learnable Embedding Space for Efficient Neural Architecture Compression
☆29 · Apr 25, 2019 · Updated 7 years ago
rajatvd / NeuralODE
Experiments with Neural ODEs and Adversarial Attacks
☆45 · Jan 13, 2019 · Updated 7 years ago
bert-nmt / ctx-bert-nmt
Extend bert-nmt to context-aware translation.
☆11 · May 24, 2021 · Updated 5 years ago
SaeedNajafi / pytorch-ocd
Implementation of the Optimal Completion Distillation for Sequence Labeling
☆17 · Jul 25, 2024 · Updated 2 years ago
ChengyueGongR / Frequency-Agnostic
Code for NIPS 2018 paper 'Frequency-Agnostic Word Representation'
☆115 · May 2, 2019 · Updated 7 years ago
idiap / fast-transformers
Pytorch library for fast transformer implementations
☆1,775 · Mar 23, 2023 · Updated 3 years ago
vacancy / AdvancedIndexing-PyTorch
(Batched) advanced indexing for PyTorch.
☆54 · Dec 26, 2024 · Updated last year
dilinwang820 / nonlinear_svgd
Nonlinear SVGD for Learning Diversified Mixture Models
☆13 · Jan 23, 2019 · Updated 7 years ago
ChenhongyiYang / CCOP
Contrastive Object-level Pre-training with Spatial Noise Curriculum Learning
☆20 · Feb 4, 2022 · Updated 4 years ago
minicheshire / InResNet
Interpolation between Residual and Non-Residual Networks, ICML 2020. https://arxiv.org/abs/2006.05749
☆26 · Aug 16, 2020 · Updated 5 years ago