Are Intermediate Layers and Labels Really Necessary? A General Language Model Distillation Method ; GKD: A General Knowledge Distillation Framework for Large-scale Pre-trained Language Model
☆33Aug 4, 2023Updated 2 years ago
Alternatives and similar repositories for GLMKD
Users that are interested in GLMKD are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- ☆10Sep 13, 2022Updated 3 years ago
- ☆13Jan 22, 2025Updated last year
- [ACL 2025] RuleArena: A Benchmark for Rule-Guided Reasoning with LLMs in Real-World Scenarios☆26Jul 2, 2025Updated 9 months ago
- This repository contains papers for a comprehensive survey on accelerated generation techniques in Large Language Models (LLMs).☆11May 24, 2024Updated last year
- Open-source Human Feedback Library☆11Oct 25, 2023Updated 2 years ago
- Managed Database hosting by DigitalOcean • AdPostgreSQL, MySQL, MongoDB, Kafka, Valkey, and OpenSearch available. Automatically scale up storage and focus on building your apps.
- This is the repository for the paper 'DiaHalu: A Dialogue-level Hallucination Evaluation Benchmark for Large Language Models' (EMNLP2024 …☆18Apr 5, 2025Updated last year
- Repository for "Propagating Knowledge Updates to LMs Through Distillation" (NeurIPS 2023).☆26Aug 25, 2024Updated last year
- ☆16May 6, 2021Updated 4 years ago
- Project for SNARE benchmark☆11Jun 5, 2024Updated last year
- Implementation of ICML 23 Paper: Specializing Smaller Language Models towards Multi-Step Reasoning.☆132Jun 18, 2023Updated 2 years ago
- 🚀 Automatically convert unstructured data into a high-quality 'textbook' format, optimized for fine-tuning Large Language Models (LLMs)☆26Oct 15, 2023Updated 2 years ago
- (ACL-IJCNLP 2021) Convolutions and Self-Attention: Re-interpreting Relative Positions in Pre-trained Language Models.☆21Jul 13, 2022Updated 3 years ago
- ☆16Dec 21, 2023Updated 2 years ago
- GPT Demo with hybrid distributed training☆10Dec 1, 2022Updated 3 years ago
- NordVPN Special Discount Offer • AdSave on top-rated NordVPN 1 or 2-year plans with secure browsing, privacy protection, and support for for all major platforms.
- Do Multilingual Language Models Think Better in English?☆42Aug 3, 2023Updated 2 years ago
- Official PyTorch implementation of (ICME2025 oral) "AutoStyle-TTS: Retrieval-Augmented Generation based Automatic Style Matching Text-to-…☆24Feb 1, 2026Updated 2 months ago
- ☆11Nov 13, 2024Updated last year
- Official MCP Server for newnow, 40+ sources available.☆28Dec 19, 2025Updated 3 months ago
- Code for "Everybody Prune Now: Structured Pruning of LLMs with only Forward Passes"☆30Mar 28, 2024Updated 2 years ago
- ☆13Jul 15, 2021Updated 4 years ago
- Official code implementation for the ACL 2025 paper: 'CoT-based Synthesizer: Enhancing LLM Performance through Answer Synthesis'☆33May 19, 2025Updated 10 months ago
- Code for EMNLP 2021 main conference paper "Dynamic Knowledge Distillation for Pre-trained Language Models"☆41Aug 9, 2022Updated 3 years ago
- ☆17May 17, 2022Updated 3 years ago
- Wordpress hosting with auto-scaling on Cloudways • AdFully Managed hosting built for WordPress-powered businesses that need reliable, auto-scalable hosting. Cloudways SafeUpdates now available.
- ☆36Mar 25, 2024Updated 2 years ago
- Official implementation of Hierarchical Context Merging: Better Long Context Understanding for Pre-trained LLMs (ICLR 2024).☆43Aug 6, 2024Updated last year
- An experiment to see if chatgpt can improve the output of the stanford alpaca dataset☆12Mar 29, 2023Updated 3 years ago
- Official code for "Rethinking Chain-of-Thought Reasoning for Videos"☆20Dec 14, 2025Updated 3 months ago
- Code and data from the paper 'Human Feedback is not Gold Standard'☆20Apr 1, 2026Updated last week
- ☆12Jan 25, 2024Updated 2 years ago
- Implementation of the paper "Fine-Tuning Transformers: Vocabulary Transfer" https://arxiv.org/pdf/2112.14569.pdf☆20Dec 28, 2021Updated 4 years ago
- HyPe: Better Pre-trained Language Model Fine-tuning with Hidden Representation Perturbation [ACL 2023]☆14Jul 11, 2023Updated 2 years ago
- ☆10Apr 16, 2021Updated 4 years ago
- NordVPN Threat Protection Pro™ • AdTake your cybersecurity to the next level. Block phishing, malware, trackers, and ads. Lightweight app that works with all browsers.
- ☆15Nov 23, 2023Updated 2 years ago
- CEVAE with VampPrior☆11Jul 18, 2018Updated 7 years ago
- Lift-style CSS selector transforms based on Scalate's Scuery☆10Aug 23, 2012Updated 13 years ago
- Code for "Towards Robust k-Nearest-Neighbor Machine Translation" (EMNLP 2022)☆12Oct 18, 2022Updated 3 years ago
- ☆16Jul 12, 2024Updated last year
- Referring expression comprehension on ReferIt(RefClef)☆10Nov 28, 2016Updated 9 years ago
- DWIE (Deutsche Welle corpus for Information Extraction) dataset. Introduced in our "DWIE: an entity-centric dataset for multi-task docume…☆52Jul 23, 2023Updated 2 years ago