sIncerass/powernorm

[ICML 2020] code for "PowerNorm: Rethinking Batch Normalization in Transformers" https://arxiv.org/abs/2003.07845

☆120

Alternatives and similar repositories for powernorm

Users who are interested in powernorm are comparing it to the repositories listed below.

jmzhao / pbos
☆19 · Oct 10, 2020 · Updated 5 years ago
zhuohan123 / macaron-net
Codes for "Understanding and Improving Transformer From a Multi-Particle Dynamic System Point of View"
☆147 · Jun 10, 2019 · Updated 7 years ago
layer6ai-labs / T-Fixup
Code for the ICML'20 paper "Improving Transformer Optimization Through Better Initialization"
☆90 · Feb 1, 2021 · Updated 5 years ago
nestordemeure / AdaHessianJax
Jax implementation of the AdaHessian optimizer
☆19 · Mar 11, 2021 · Updated 5 years ago
sacmehta / delight
DeLighT: Very Deep and Light-Weight Transformers
☆469 · Oct 16, 2020 · Updated 5 years ago
facebookresearch / adaptive-span
Transformer training code for sequential tasks
☆610 · Sep 14, 2021 · Updated 4 years ago
yoniaflalo / knapsack_pruning
Implementation of knapsack pruning
☆27 · May 26, 2020 · Updated 6 years ago
zlinao / MinTL
MinTL: Minimalist Transfer Learning for Task-Oriented Dialogue Systems
☆68 · Oct 26, 2021 · Updated 4 years ago
karndeb / NLP-Service
☆13 · Aug 4, 2021 · Updated 4 years ago
chen42 / RatsPub
Using PubMed to find out how a gene contributes to addiction.
☆20 · Dec 27, 2022 · Updated 3 years ago
md-experiments / elastic_transformers
Making BERT stretchy. Semantic Elasticsearch with Sentence Transformers
☆160 · Sep 25, 2020 · Updated 5 years ago
jiangtaoxie / SoT
SoT: Delving Deeper into Classification Head for Transformer
☆50 · Dec 24, 2021 · Updated 4 years ago
sheffieldnlp / ImitationLearningTutorialEACL2017
Repo for the EACL2017 tutorial on imitation learning
☆28 · Apr 3, 2017 · Updated 9 years ago
e-commerce-search / bert2dnn
Large Scale BERT Distillation
☆33 · Mar 24, 2023 · Updated 3 years ago
changlin31 / BossNAS
(ICCV 2021) BossNAS: Exploring Hybrid CNN-transformers with Block-wisely Self-supervised Neural Architecture Search
☆143 · Dec 6, 2021 · Updated 4 years ago
AlibabaPAI / one_shot_text_labeling
code and data for paper "One-shot Text Field Labeling using Attention and BeliefPropagation for Structure Information Extraction"
☆61 · Aug 9, 2020 · Updated 5 years ago
piEsposito / transformers-low-code-experiments
Low-code pre-built pipelines for experiments with huggingface/transformers for Data Scientists in a rush.
☆16 · Oct 14, 2020 · Updated 5 years ago
margaretmz / esrgan-e2e-tflite-tutorial
ESRGAN E2E TFLite Tutorial
☆18 · Aug 3, 2020 · Updated 5 years ago
gonglinyuan / StackingBERT
Source code for "Efficient Training of BERT by Progressively Stacking"
☆112 · Jul 3, 2019 · Updated 7 years ago
lancopku / AdaNorm
Code for "Understanding and Improving Layer Normalization"
☆46 · Dec 8, 2019 · Updated 6 years ago
JiaxiongQ / SlimConv
Reducing Channel Redundancy in Convolutional Neural Networks by Features Recombining (TIP 2021)
☆20 · Mar 1, 2023 · Updated 3 years ago
vtddggg / Robust-Vision-Transformer
The implementation of our paper: Towards Robust Vision Transformer (CVPR2022)
☆142 · Aug 16, 2022 · Updated 3 years ago
merantix / acosp
Semantic Segmentation in Pytorch
☆10 · Dec 9, 2022 · Updated 3 years ago
gmh14 / RobNets
[CVPR 2020] When NAS Meets Robustness: In Search of Robust Architectures against Adversarial Attacks
☆126 · Oct 21, 2020 · Updated 5 years ago
adriangonz / statistical-nlp-17
Repository for group 17 on the Statistical Natural Language Processing module at UCL
☆23 · Aug 23, 2021 · Updated 4 years ago
mlpc-ucsd / BERT_Convolutions
(ACL-IJCNLP 2021) Convolutions and Self-Attention: Re-interpreting Relative Positions in Pre-trained Language Models.
☆21 · Jul 13, 2022 · Updated 4 years ago
singlasahil14 / CONV-SV
A fast and efficient way to compute a differentiable bound on the singular values of convolution layers
☆12 · Nov 22, 2019 · Updated 6 years ago
ivanpanshin / hist_cancer
Histopathologic Cancer Detection model based on Kaggle Challenge https://www.kaggle.com/c/histopathologic-cancer-detection (top 1%)
☆11 · Feb 16, 2021 · Updated 5 years ago
VITA-Group / Trap-and-Replace-Backdoor-Defense
[NeurIPS'22] Trap and Replace: Defending Backdoor Attacks by Trapping Them into an Easy-to-Replace Subnetwork. Haotao Wang, Junyuan Hong,…
☆15 · Nov 27, 2023 · Updated 2 years ago
prajjwal1 / adaptive_transformer
Code for the paper "Adaptive Transformers for Learning Multimodal Representations" (ACL SRW 2020)
☆43 · Oct 20, 2022 · Updated 3 years ago
ChengyueGongR / PatchVisionTransformer
☆74 · Dec 8, 2022 · Updated 3 years ago
nlpyang / NoisySumm
Codes for NAACL 2021 paper 'Noisy Self-Knowledge Distillation for Text Summarization'
☆24 · Jul 27, 2021 · Updated 4 years ago
sibbsnb / jina_hack_2020_search_stories
☆17 · Sep 22, 2020 · Updated 5 years ago
Kid-key / MimicNorm
☆19 · Jan 27, 2021 · Updated 5 years ago
majumderb / rezero
Official PyTorch Repo for "ReZero is All You Need: Fast Convergence at Large Depth"
☆414 · Jul 25, 2024 · Updated 2 years ago
gd-zhang / noisy-quadratic-model
Large-batch Training, Neural Network Optimization
☆10 · Nov 8, 2019 · Updated 6 years ago
PacktPublishing / Implementing-Deep-Learning-Algorithms-with-TensorFlow-2.0
☆11 · Jan 30, 2023 · Updated 3 years ago
ARM-gradient / ARSM
Low-variance and unbiased gradient for backpropagation through categorical random variables, with application in variational auto-encoder…
☆17 · Jul 1, 2020 · Updated 6 years ago
naver-ai / pit
☆245 · Jul 23, 2021 · Updated 5 years ago