Less is More: Task-aware Layer-wise Distillation for Language Model Compression (ICML2023)
☆40Aug 28, 2023Updated 2 years ago
Alternatives and similar repositories for task-aware-distillation
Users that are interested in task-aware-distillation are comparing it to the libraries listed below.
- This PyTorch package implements PLATON: Pruning Large Transformer Models with Upper Confidence Bound of Weight Importance (ICML 2022).☆46Oct 17, 2022Updated 3 years ago
- A curated list of personalized Language model / Large language model (continually updated)☆10Nov 17, 2023Updated 2 years ago
- (CVPR 2024) Uniformity and Variance for Heterogeneous Federated Learning☆12Mar 6, 2024Updated 2 years ago
- Training code for Baby-Llama, our submission to the strict-small track of the BabyLM challenge.☆86Oct 18, 2023Updated 2 years ago
- No Parameters Left Behind: Sensitivity Guided Adaptive Learning Rate for Training Large Transformer Models (ICLR 2022)☆29Feb 9, 2022Updated 4 years ago
- The official repository of "Improving Large Language Models via Fine-grained Reinforcement Learning with Minimum Editing Constraint"☆39Jan 12, 2024Updated 2 years ago
- Code for Beyond Generic: Enhancing Image Captioning with Real-World Knowledge using Vision-Language Pre-Training Model☆13Feb 15, 2024Updated 2 years ago
- Train large COMET (T5-3B/GPT2-XL) with small memory (on 11GB memory GPUs like 1080/2080) using DeepSpeed.☆14Jan 23, 2022Updated 4 years ago
- ☆11Mar 23, 2026Updated 3 months ago
- Partially Non-Autoregressive Image Captioning☆10Sep 30, 2021Updated 4 years ago
- [NeurIPS 2024] Search for Efficient LLMs☆16Jan 16, 2025Updated last year
- Official repository for "LFR-GAN: Local Feature Refinement based Generative Adversarial Network for Text-to-Image Generation" (TOMM 2023)…☆11Mar 21, 2023Updated 3 years ago
- This repository maintains a collection of important papers on knowledge distillation (awesome-knowledge-distillation).☆85Mar 19, 2025Updated last year
- [EMNLP 2023] Lion: Adversarial Distillation of Proprietary Large Language Models☆210Feb 11, 2024Updated 2 years ago
- ☆65Oct 17, 2023Updated 2 years ago
- Official PyTorch implementation of DistiLLM: Towards Streamlined Distillation for Large Language Models (ICML 2024)☆267Mar 13, 2025Updated last year
- ☆12Oct 20, 2023Updated 2 years ago
- This repository collects papers for "A Survey on Knowledge Distillation of Large Language Models". We break down KD into Knowledge Elicit…☆1,293Mar 9, 2025Updated last year
- AgentRE-Bench is an agentic benchmark that evaluates state-of-the-art models on long-horizon reverse engineering tasks, measuring their a…☆70Updated this week
- Repo for the EMNLP'24 Paper "Dual-Space Knowledge Distillation for Large Language Models". A general white-box KD framework for both same…☆63Mar 21, 2026Updated 3 months ago
- A collection for math word problem (MWP) works, including datasets, algorithms and so on.☆47Jun 18, 2024Updated 2 years ago
- Map4RDF allows visualising and interacting with Linked Geospatial Data available in any SPARQL endpoint☆10Feb 9, 2020Updated 6 years ago
- A double range slider for svelte☆16Aug 30, 2024Updated last year
- [Re-implementation] FixMatch: Simplifying Semi-Supervised Learning with Consistency and Confidence☆15Jun 29, 2020Updated 6 years ago
- ☆13Sep 5, 2023Updated 2 years ago
- Repository for "Propagating Knowledge Updates to LMs Through Distillation" (NeurIPS 2023).☆27Aug 25, 2024Updated last year
- First Latency-Aware Competitive LLM Agent Benchmark☆29Jun 3, 2025Updated last year
- ☆53Dec 31, 2024Updated last year
- ☆11Jul 6, 2023Updated 2 years ago
- ☆22Oct 22, 2024Updated last year
- ☆13Dec 9, 2024Updated last year
- CLIPCleaner: Cleaning Noisy Labels with CLIP (ACM MM2024)☆15Apr 28, 2025Updated last year
- ☆23Nov 26, 2024Updated last year
- Scripts for downloading and pre-processing the `proof-pile`, a high quality dataset of mathematical text and code.☆22Nov 26, 2022Updated 3 years ago
- Exploring and improving the quality of ChatGPT-generated code for LeetCode programming tasks.☆11Jan 19, 2024Updated 2 years ago
- ☆28Mar 5, 2024Updated 2 years ago
- ☆10Dec 28, 2018Updated 7 years ago
- Synthesizing Fingerprint from Pattern Type Analysis Features using cGAN - WITC 2019☆12Apr 19, 2019Updated 7 years ago
- Feature Structure Distillation with Centered Kernel Alignment in BERT Transferring official code☆11Jul 17, 2023Updated 2 years ago