☆17May 14, 2020Updated 5 years ago
Alternatives and similar repositories for bert-prune
Users who are interested in bert-prune are comparing it to the libraries listed below
- Why Low-Precision Transformer Training Fails: An Analysis on Flash Attention☆51Oct 16, 2025Updated 5 months ago
- ☆11Jan 21, 2020Updated 6 years ago
- Block Sparse movement pruning☆83Nov 26, 2020Updated 5 years ago
- Reproduction instructions for "Rapid Adaptation of Neural Machine Translation to New Languages"☆39Aug 7, 2018Updated 7 years ago
- [NeurIPS 2020] "The Lottery Ticket Hypothesis for Pre-trained BERT Networks", Tianlong Chen, Jonathan Frankle, Shiyu Chang, Sijia Liu, Ya…☆142Dec 30, 2021Updated 4 years ago
- pialign - A Phrasal ITG Aligner☆24Apr 29, 2019Updated 6 years ago
- ☆11Nov 5, 2024Updated last year
- Code for "AtTGen: Attribute Tree Generation for Real-World Attribute Joint Extraction", ACL 2023☆13May 19, 2023Updated 2 years ago
- JsonTuning: Towards Generalizable, Robust, and Controllable Instruction Tuning☆10Nov 3, 2024Updated last year
- Code for Generalized Entropy Regularization paper☆14May 2, 2020Updated 5 years ago
- Model for processing text sequences with coreference annotations☆14Nov 29, 2018Updated 7 years ago
- Code for replicating the work in "Targeted Syntactic Evaluation of Language Models." EMNLP 2018.☆43Apr 25, 2020Updated 5 years ago
- ☆10Nov 6, 2020Updated 5 years ago
- This repository contains the Adverbs in Recipes (AIR) dataset and the code published at the CVPR 23 paper: "Learning Action Changes by Me…☆13May 25, 2023Updated 2 years ago
- lanmt ebm☆12Jun 19, 2020Updated 5 years ago
- The official implementation of You Only Compress Once: Towards Effective and Elastic BERT Compression via Exploit-Explore Stochastic Natu…☆48Jul 1, 2021Updated 4 years ago
- Code for reproducing the results in "How Well do Sparse Imagenet Models Transfer?", presented at CVPR 2022☆10Jun 3, 2022Updated 3 years ago
- ☆21Dec 5, 2022Updated 3 years ago
- Source code and data for the paper "Towards String-to-Tree Neural Machine Translation"☆16Dec 31, 2017Updated 8 years ago
- Smoothing video traffic to make it a friendlier internet neighbor☆14Apr 23, 2024Updated last year
- Code for the paper "Data Feedback Loops: Model-driven Amplification of Dataset Biases"☆18Sep 9, 2022Updated 3 years ago
- Dependency Grammar Induction☆18Feb 11, 2019Updated 7 years ago
- Official Pytorch Implementation of Length-Adaptive Transformer (ACL 2021)☆102Nov 2, 2020Updated 5 years ago
- FaVIQ: Fact Verification from Information-seeking Questions☆43Nov 23, 2022Updated 3 years ago
- Confident Adaptive Transformers☆14Apr 18, 2021Updated 4 years ago
- Pyramidal Recurrent Units (PRUs): A New LSTM Unit☆10Aug 29, 2018Updated 7 years ago
- Extending context length of visual language models☆12Dec 18, 2024Updated last year
- UW-Madison Course Monitor☆10Oct 4, 2019Updated 6 years ago
- Any-Order GPT as Masked Diffusion Model: Decoupling Formulation and Architecture. Training an MDM using GPT with this repo!☆35Jun 23, 2025Updated 8 months ago
- A crawler project that aims to collect news from most newspaper publishers☆10Jul 29, 2019Updated 6 years ago
- Code for the paper "Are Sixteen Heads Really Better than One?"☆175Apr 1, 2020Updated 5 years ago
- A simple, often-used multiprocessor scheduling (load balancing) algorithm is the LPT algorithm (Longest Processing Time) which sorts the …☆11Aug 21, 2018Updated 7 years ago
- Code for reproducing experiments performed for Accordion☆13Jun 11, 2021Updated 4 years ago
- PyTorch implementation of Patient Knowledge Distillation for BERT Model Compression☆204Sep 20, 2019Updated 6 years ago
- Reproducible analyses for the NicheCompass manuscript☆13Jul 3, 2025Updated 8 months ago
- PTX-EMU is a simple emulator for CUDA programs.☆38Apr 25, 2025Updated 10 months ago
- Codebase for training the SubCell models☆18Mar 16, 2026Updated last week
- Notch filtering using ofxCv☆10May 17, 2021Updated 4 years ago
- Data and Baselines for AStitchInLanguageModels dataset☆12Oct 31, 2022Updated 3 years ago