[ACL-IJCNLP 2021] "EarlyBERT: Efficient BERT Training via Early-bird Lottery Tickets" by Xiaohan Chen, Yu Cheng, Shuohang Wang, Zhe Gan, Zhangyang Wang and Jingjing Liu
☆18Dec 30, 2021Updated 4 years ago
Alternatives and similar repositories for EarlyBERT
Users that are interested in EarlyBERT are comparing it to the libraries listed below.
- [ECCV 2022] SuperTickets: Drawing Task-Agnostic Lottery Tickets from Supernets via Jointly Architecture Searching and Parameter Pruning☆20Jul 7, 2022Updated 3 years ago
- Code for our AAAI2021 paper: Token-Aware Virtual Adversarial Training For Language Understanding.☆25Dec 3, 2020Updated 5 years ago
- Code of Robust Lottery Tickets for Pre-trained Language Models (ACL2022)☆20Jul 18, 2022Updated 3 years ago
- [ICML2023] Instant Soup: Cheap Pruning Ensembles in A Single Pass Can Draw Lottery Tickets from Large Models. Ajay Jaiswal, Shiwei Liu, Ti…☆11Nov 28, 2023Updated 2 years ago
- ☆26Nov 23, 2023Updated 2 years ago
- A Theano implementation of a CNN DSEBM (deep structured energy-based model) described in https://arxiv.org/pdf/1605.07717v2.pdf☆10Oct 13, 2016Updated 9 years ago
- [NeurIPS 2021] “Improving Contrastive Learning on Imbalanced Data via Open-World Sampling”, Ziyu Jiang, Tianlong Chen, Ting Chen, Zhangya…☆29Dec 30, 2021Updated 4 years ago
- [ICLR 2021] "Long Live the Lottery: The Existence of Winning Tickets in Lifelong Learning" by Tianlong Chen*, Zhenyu Zhang*, Sijia Liu, S…☆26Dec 30, 2021Updated 4 years ago
- [TMLR] "Adversarial Feature Augmentation and Normalization for Visual Recognition", Tianlong Chen, Yu Cheng, Zhe Gan, Jianfeng Wang, Liju…☆21Nov 27, 2022Updated 3 years ago
- Source code for paper on commonsense reasoning for 2020 Annual Conference of the Association for Computational Linguistics (ACL) 2020.☆29Aug 2, 2024Updated last year
- Interpreting CLIP with Hierarchical Sparse Autoencoders (ICML 2025)☆26Jan 17, 2026Updated 4 months ago
- [CVPR 2022] "The Principle of Diversity: Training Stronger Vision Transformers Calls for Reducing All Levels of Redundancy" by Tianlong C…☆25Mar 9, 2022Updated 4 years ago
- [NeurIPS 2020] "The Lottery Ticket Hypothesis for Pre-trained BERT Networks", Tianlong Chen, Jonathan Frankle, Shiyu Chang, Sijia Liu, Ya…☆142Dec 30, 2021Updated 4 years ago
- ☆18Nov 5, 2016Updated 9 years ago
- ☆16Apr 14, 2021Updated 5 years ago
- A CUDA-based, GPU-accelerated general-purpose genetic algorithm implementation; the experimental platform is the Nvidia Jetson Nano☆13Mar 23, 2023Updated 3 years ago
- ☆28Sep 28, 2021Updated 4 years ago
- The official code repository for the FullFront benchmark☆27May 16, 2025Updated last year
- ☆15Nov 7, 2024Updated last year
- Adversarial Training for Natural Language Understanding☆252Sep 6, 2023Updated 2 years ago
- Pixel Parsing. A reproduction of OCR-free end-to-end document understanding models with open data☆23Jul 30, 2024Updated last year
- Color calibration files☆11Aug 27, 2020Updated 5 years ago
- Implementation for "An Approximation of the Error Backpropagation Algorithm in a Predictive Coding Network with Local Hebbian Synaptic Pl…☆17Oct 10, 2018Updated 7 years ago
- PyTorch implementation of Language model compression with weighted low-rank factorization☆14Jun 28, 2023Updated 2 years ago
- ☆14Apr 16, 2024Updated 2 years ago
- ☆21Jul 5, 2024Updated last year
- [NAACL 2022] "Learning to Win Lottery Tickets in BERT Transfer via Task-agnostic Mask Training", Yuanxin Liu, Fandong Meng, Zheng Lin, Pe…☆15Oct 18, 2022Updated 3 years ago
- [NeurIPS'21] "Ultra-Data-Efficient GAN Training: Drawing A Lottery Ticket First, Then Training It Toughly", Tianlong Chen, Yu Cheng, Zhe …☆84Dec 30, 2021Updated 4 years ago
- 🔥 🔥 [WACV2024] Mini but Mighty: Finetuning ViTs with Mini Adapters☆20Jul 5, 2024Updated last year
- The code for "VISTA: Enhancing Long-Duration and High-Resolution Video Understanding by VIdeo SpatioTemporal Augmentation" [CVPR2025]☆21Feb 27, 2025Updated last year
- [NeurIPS 2020] "FracTrain: Fractionally Squeezing Bit Savings Both Temporally and Spatially for Efficient DNN Training" by Yonggan Fu, Ha…☆10Feb 13, 2022Updated 4 years ago
- Adversarial Category Alignment Network for Cross-domain Sentiment Classification (NAACL 2019)☆23Jul 4, 2019Updated 6 years ago
- LogiTorch is a PyTorch-based library for logical reasoning on natural language☆73Oct 10, 2025Updated 7 months ago
- ☆25Dec 13, 2024Updated last year
- A complete Linux project for the ZYBO. This project helps me during my first steps with embedded Linux. You can find anything necessary t…☆13Oct 8, 2020Updated 5 years ago
- This repository contains the implementation of the paper "MeteoRA: Multiple-tasks Embedded LoRA for Large Language Models".☆25May 28, 2025Updated 11 months ago
- triton ver of gqa flash attn, based on the tutorial☆12Aug 4, 2024Updated last year
- Implementation of the retriever distillation procedure as outlined in the paper "Distilling Knowledge from Reader to Retriever"☆32Dec 16, 2020Updated 5 years ago
- Code for "Retaining Key Information under High Compression Rates: Query-Guided Compressor for LLMs" (ACL 2024)☆19Jun 12, 2024Updated last year