[NeurIPS 2020] "The Lottery Ticket Hypothesis for Pre-trained BERT Networks", Tianlong Chen, Jonathan Frankle, Shiyu Chang, Sijia Liu, Yang Zhang, Zhangyang Wang, Michael Carbin
☆142 · Dec 30, 2021 · Updated 4 years ago
Alternatives and similar repositories for BERT-Tickets
Users that are interested in BERT-Tickets are comparing it to the libraries listed below.
- [NAACL 2022] "Learning to Win Lottery Tickets in BERT Transfer via Task-agnostic Mask Training", Yuanxin Liu, Fandong Meng, Zheng Lin, Pe… ☆15 · Oct 18, 2022 · Updated 3 years ago
- SentAugment is a data augmentation technique for NLP that retrieves similar sentences from a large bank of sentences. It can be used in c… ☆359 · Feb 22, 2022 · Updated 4 years ago
- [ICLR 2021] "Learning a Minimax Optimizer: A Pilot Study" by Jiayi Shen*, Xiaohan Chen*, Howard Heaton*, Tianlong Chen, Jialin Liu, Wotao… ☆15 · Dec 30, 2021 · Updated 4 years ago
- [ICML 2024] Junk DNA Hypothesis: A Task-Centric Angle of LLM Pre-trained Weights through Sparsity; Lu Yin*, Ajay Jaiswal*, Shiwei Liu, So… ☆16 · Apr 21, 2025 · Updated 11 months ago
- Princeton NLP's pre-training library based on fairseq with DeepSpeed kernel integration 🚃 ☆117 · Oct 27, 2022 · Updated 3 years ago
- Block Sparse movement pruning ☆83 · Nov 26, 2020 · Updated 5 years ago
- [ICML 2021] "Efficient Lottery Ticket Finding: Less Data is More" by Zhenyu Zhang*, Xuxi Chen*, Tianlong Chen*, Zhangyang Wang ☆26 · Dec 30, 2021 · Updated 4 years ago
- The implementation of "Neural Machine Translation without Embeddings", NAACL 2021 ☆33 · Jun 9, 2021 · Updated 4 years ago
- The official code repository for MetricMT - a reward optimization method for NMT with learned metrics ☆25 · Apr 24, 2021 · Updated 4 years ago
- [ACL-IJCNLP 2021] "EarlyBERT: Efficient BERT Training via Early-bird Lottery Tickets" by Xiaohan Chen, Yu Cheng, Shuohang Wang, Zhe Gan, … ☆18 · Dec 30, 2021 · Updated 4 years ago
- Understanding the Difficulty of Training Transformers ☆332 · May 31, 2022 · Updated 3 years ago
- ☆20 · Dec 16, 2020 · Updated 5 years ago
- [CVPR 2021] "The Lottery Tickets Hypothesis for Supervised and Self-supervised Pre-training in Computer Vision Models" Tianlong Chen, Jon… ☆68 · Dec 17, 2022 · Updated 3 years ago
- ☆22 · Apr 21, 2021 · Updated 4 years ago
- [NeurIPS'21] "Chasing Sparsity in Vision Transformers: An End-to-End Exploration" by Tianlong Chen, Yu Cheng, Zhe Gan, Lu Yuan, Lei Zhang… ☆89 · Dec 1, 2023 · Updated 2 years ago
- ☆17 · May 14, 2020 · Updated 5 years ago
- ☆32 · Sep 27, 2021 · Updated 4 years ago
- Super Tickets in Pre-Trained Language Models: From Model Compression to Improving Generalization (ACL 2021) ☆19 · Jul 28, 2021 · Updated 4 years ago
- [ICLR 2021] "Long Live the Lottery: The Existence of Winning Tickets in Lifelong Learning" by Tianlong Chen*, Zhenyu Zhang*, Sijia Liu, S… ☆26 · Dec 30, 2021 · Updated 4 years ago
- This repository contains the code for the paper in Findings of EMNLP 2021: "EfficientBERT: Progressively Searching Multilayer Perceptron … ☆33 · Jun 14, 2023 · Updated 2 years ago
- [AAAI 2021] "ROSITA: Refined BERT cOmpreSsion with InTegrAted techniques", Yuanxin Liu, Zheng Lin, Fengcheng Yuan ☆14 · Oct 18, 2022 · Updated 3 years ago
- A repository in preparation for open-sourcing lottery ticket hypothesis code. ☆636 · Sep 6, 2022 · Updated 3 years ago
- Cascaded Text Generation with Markov Transformers ☆130 · Mar 20, 2023 · Updated 3 years ago
- Repository for our ICLR 2019 paper: Discovery of Natural Language Concepts in Individual Units of CNNs ☆26 · Mar 9, 2019 · Updated 7 years ago
- Code for "A Multi-Task Approach for Disentangling Syntax and Semantics in Sentence Representations" (NAACL 2019) ☆67 · Mar 5, 2021 · Updated 5 years ago
- Meta Representation Transformation for Low-resource Cross-lingual Learning ☆41 · May 5, 2021 · Updated 4 years ago
- Official implementation of the NeurIPS 2020 paper "Sparse Weight Activation Training". ☆29 · Jul 23, 2021 · Updated 4 years ago
- ☆19 · Oct 6, 2020 · Updated 5 years ago
- ☆13 · Aug 28, 2018 · Updated 7 years ago
- ENGINE: Energy-Based Inference Networks for Non-Autoregressive Machine Translation ☆25 · Oct 2, 2020 · Updated 5 years ago
- [ACL 2022] Structured Pruning Learns Compact and Accurate Models https://arxiv.org/abs/2204.00408 ☆198 · May 9, 2023 · Updated 2 years ago
- ☆45 · Oct 11, 2021 · Updated 4 years ago
- [Unofficial] Kakaotrans: Kakao translate API for python ☆16 · Mar 29, 2020 · Updated 5 years ago
- ☆47 · Jan 21, 2021 · Updated 5 years ago
- Code and checkpoints of compressed networks for the paper titled "HYDRA: Pruning Adversarially Robust Neural Networks" (NeurIPS 2020) (ht… ☆90 · Dec 22, 2022 · Updated 3 years ago
- [ICLR 2021] "GANs Can Play Lottery Too" by Xuxi Chen, Zhenyu Zhang, Yongduo Sui, Tianlong Chen ☆26 · Feb 18, 2022 · Updated 4 years ago
- ☆178 · Jul 31, 2020 · Updated 5 years ago
- [NAACL 2021] Factual Probing Is [MASK]: Learning vs. Learning to Recall https://arxiv.org/abs/2104.05240 ☆168 · Oct 7, 2022 · Updated 3 years ago
- PyTorch code for EMNLP 2020 Paper "Vokenization: Improving Language Understanding with Visual Supervision" ☆191 · Mar 8, 2021 · Updated 5 years ago