[NeurIPS 2020] "The Lottery Ticket Hypothesis for Pre-trained BERT Networks", Tianlong Chen, Jonathan Frankle, Shiyu Chang, Sijia Liu, Yang Zhang, Zhangyang Wang, Michael Carbin
☆142 · Dec 30, 2021 · Updated 4 years ago
Alternatives and similar repositories for BERT-Tickets
Users that are interested in BERT-Tickets are comparing it to the libraries listed below.
- [NAACL 2022] "Learning to Win Lottery Tickets in BERT Transfer via Task-agnostic Mask Training", Yuanxin Liu, Fandong Meng, Zheng Lin, Pe… ☆15 · Oct 18, 2022 · Updated 3 years ago
- [ICLR 2021] "Learning a Minimax Optimizer: A Pilot Study" by Jiayi Shen*, Xiaohan Chen*, Howard Heaton*, Tianlong Chen, Jialin Liu, Wotao… ☆15 · Dec 30, 2021 · Updated 4 years ago
- [ICML 2024] Junk DNA Hypothesis: A Task-Centric Angle of LLM Pre-trained Weights through Sparsity; Lu Yin*, Ajay Jaiswal*, Shiwei Liu, So… ☆16 · Apr 21, 2025 · Updated last year
- Princeton NLP's pre-training library based on fairseq with DeepSpeed kernel integration 🚃 ☆117 · Oct 27, 2022 · Updated 3 years ago
- Block Sparse movement pruning ☆83 · Nov 26, 2020 · Updated 5 years ago
- The implementation of "Neural Machine Translation without Embeddings", NAACL 2021 ☆33 · Jun 9, 2021 · Updated 4 years ago
- The official code repository for MetricMT - a reward optimization method for NMT with learned metrics ☆25 · Apr 24, 2021 · Updated 5 years ago
- [ACL-IJCNLP 2021] "EarlyBERT: Efficient BERT Training via Early-bird Lottery Tickets" by Xiaohan Chen, Yu Cheng, Shuohang Wang, Zhe Gan, … ☆18 · Dec 30, 2021 · Updated 4 years ago
- Understanding the Difficulty of Training Transformers ☆332 · May 31, 2022 · Updated 3 years ago
- ☆20 · Dec 16, 2020 · Updated 5 years ago
- [CVPR 2021] "The Lottery Tickets Hypothesis for Supervised and Self-supervised Pre-training in Computer Vision Models" Tianlong Chen, Jon… ☆68 · Dec 17, 2022 · Updated 3 years ago
- ☆22 · Apr 21, 2021 · Updated 5 years ago
- ☆32 · Sep 27, 2021 · Updated 4 years ago
- Super Tickets in Pre-Trained Language Models: From Model Compression to Improving Generalization (ACL 2021) ☆19 · Jul 28, 2021 · Updated 4 years ago
- [ICLR 2021] "Long Live the Lottery: The Existence of Winning Tickets in Lifelong Learning" by Tianlong Chen*, Zhenyu Zhang*, Sijia Liu, S… ☆26 · Dec 30, 2021 · Updated 4 years ago
- This repository contains the code for the paper in Findings of EMNLP 2021: "EfficientBERT: Progressively Searching Multilayer Perceptron … ☆33 · Jun 14, 2023 · Updated 2 years ago
- [AAAI 2021] "ROSITA: Refined BERT cOmpreSsion with InTegrAted techniques", Yuanxin Liu, Zheng Lin, Fengcheng Yuan ☆14 · Oct 18, 2022 · Updated 3 years ago
- A repository in preparation for open-sourcing lottery ticket hypothesis code. ☆640 · Sep 6, 2022 · Updated 3 years ago
- Cascaded Text Generation with Markov Transformers ☆130 · Mar 20, 2023 · Updated 3 years ago
- Repository for our ICLR 2019 paper: Discovery of Natural Language Concepts in Individual Units of CNNs ☆26 · Mar 9, 2019 · Updated 7 years ago
- Code for "A Multi-Task Approach for Disentangling Syntax and Semantics in Sentence Representations" (NAACL 2019) ☆67 · Mar 5, 2021 · Updated 5 years ago
- Meta Representation Transformation for Low-resource Cross-lingual Learning ☆42 · May 5, 2021 · Updated 5 years ago
- Official implementation of Neurips 2020 "Sparse Weight Activation Training" paper. ☆29 · Jul 23, 2021 · Updated 4 years ago
- ☆19 · Oct 6, 2020 · Updated 5 years ago
- ☆13 · Aug 28, 2018 · Updated 7 years ago
- ENGINE: Energy-Based Inference Networks for Non-Autoregressive Machine Translation ☆25 · Oct 2, 2020 · Updated 5 years ago
- [ACL 2022] Structured Pruning Learns Compact and Accurate Models https://arxiv.org/abs/2204.00408 ☆198 · May 9, 2023 · Updated 2 years ago
- ☆45 · Oct 11, 2021 · Updated 4 years ago
- [Unofficial] Kakaotrans: Kakao translate API for python ☆16 · Mar 29, 2020 · Updated 6 years ago
- ☆47 · Jan 21, 2021 · Updated 5 years ago
- [ICLR 2021] "GANs Can Play Lottery Too" by Xuxi Chen, Zhenyu Zhang, Yongduo Sui, Tianlong Chen ☆26 · Feb 18, 2022 · Updated 4 years ago
- ☆179 · Jul 31, 2020 · Updated 5 years ago
- [NAACL 2021] Factual Probing Is [MASK]: Learning vs. Learning to Recall https://arxiv.org/abs/2104.05240 ☆168 · Oct 7, 2022 · Updated 3 years ago
- Official Pytorch Implementation of Length-Adaptive Transformer (ACL 2021) ☆102 · Nov 2, 2020 · Updated 5 years ago
- PyTorch code for EMNLP 2020 Paper "Vokenization: Improving Language Understanding with Visual Supervision" ☆190 · Mar 8, 2021 · Updated 5 years ago
- Pytorch version of NIPS'16 "Learning to learn by gradient descent by gradient descent" ☆69 · Jul 6, 2023 · Updated 2 years ago
- Transfer Learning in Dialogue Benchmarking Toolkit ☆14 · Mar 31, 2023 · Updated 3 years ago
- A masked language modeling objective to train a model to predict any subset of the target words, conditioned on both the input text and a… ☆246 · Sep 17, 2021 · Updated 4 years ago
- ☆16 · Apr 11, 2022 · Updated 4 years ago