aitsc / GLMKD

Are Intermediate Layers and Labels Really Necessary? A General Language Model Distillation Method; GKD: A General Knowledge Distillation Framework for Large-scale Pre-trained Language Model