Official PyTorch implementation of DistiLLM-2: A Contrastive Approach Boosts the Distillation of LLMs (ICML 2025 Oral)
☆59Jun 27, 2025Updated 8 months ago
Alternatives and similar repositories for distillm-2
Users that are interested in distillm-2 are comparing it to the libraries listed below.
- This repo is for CaesarNeRF: Calibrated Semantic Representation for Few-Shot Generalizable Neural Rendering.☆14Mar 6, 2024Updated 2 years ago
- Official code release of Hilbert Diffusion Model (PyTorch ver.)☆21Aug 17, 2024Updated last year
- Official PyTorch implementation of DistiLLM: Towards Streamlined Distillation for Large Language Models (ICML 2024)☆255Mar 13, 2025Updated last year
- Self-Contrastive Learning: Single-viewed Supervised Contrastive Framework using Sub-network (AAAI 2023)☆21Oct 28, 2023Updated 2 years ago
- (NeurIPS 2022) Understanding Cross-Domain Few-Shot Learning Based on Domain Similarity and Few-Shot Difficulty☆33Mar 19, 2024Updated 2 years ago
- LongAttn: Selecting Long-context Training Data via Token-level Attention☆15Jul 16, 2025Updated 8 months ago
- Learning Efficient Vision Transformers via Fine-Grained Manifold Distillation. NeurIPS 2022.☆34Oct 18, 2022Updated 3 years ago
- Official PyTorch implementation of SynergyNeRF: "Synergistic Integration of Coordinate Network and Tensorial Feature for Improving NeRFs …☆10Sep 23, 2024Updated last year
- [AAAI 2021] "ROSITA: Refined BERT cOmpreSsion with InTegrAted techniques", Yuanxin Liu, Zheng Lin, Fengcheng Yuan☆14Oct 18, 2022Updated 3 years ago
- Repo for the EMNLP'24 Paper "Dual-Space Knowledge Distillation for Large Language Models". A general white-box KD framework for both same…☆61Updated this week
- Open-source code and data for ShadowNet (S&P Oakland'23)☆12Mar 11, 2024Updated 2 years ago
- Awesome LLM papers, news and projects about learning to reason with LLM, OpenAI o1, reasoning techniques, chain-of-thought (CoT), Large …☆27Oct 10, 2024Updated last year
- Predict whether income exceeds $50K/yr based on census data.☆10Mar 25, 2021Updated 4 years ago
- Towards Memorization-Free Diffusion Models (CVPR2024) Codebase☆11Jun 2, 2024Updated last year
- Code Implementation for "NASH: A Simple Unified Framework of Structured Pruning for Accelerating Encoder-Decoder Language Models" (EMNLP …☆17Oct 17, 2023Updated 2 years ago
- [ICLR 2025] MiniPLM: Knowledge Distillation for Pre-Training Language Models☆73Nov 23, 2024Updated last year
- LSTM GRU with exact backpropagation derivation and implementation☆13Nov 27, 2017Updated 8 years ago
- Official Implementation of the paper "Jointly Reinforcing Diversity and Quality in Language Model Generations"☆57Dec 26, 2025Updated 2 months ago
- Vocabulary Trimming (VT) is a model compression technique, which reduces a multilingual LM vocabulary to a target language by deleting ir…☆63Oct 25, 2024Updated last year
- This is the official implementation of NNSplitter (ICML'23)☆12Jun 11, 2024Updated last year
- PyTorch implementation of the article "Generative Adversarial Network for Handwritten Text"☆10Nov 13, 2023Updated 2 years ago
- Official implementation of StochSync: a zero-shot approach for image generation in arbitrary spaces via stochastic diffusion synchronizat…☆21Jun 24, 2025Updated 8 months ago
- Code for AAAI'25 paper: LLM-Powered User Simulator for Recommender System☆24Jan 6, 2025Updated last year
- A symbolic benchmark for verifiable chain-of-thought financial reasoning. Includes executable templates, 58 topics across 12 domains, and…☆26Dec 26, 2025Updated 2 months ago
- Towards Meta-Pruning via Optimal Transport, ICLR 2024 (Spotlight)☆18Dec 5, 2024Updated last year
- To appear in the 11th International Conference on Learning Representations (ICLR 2023).☆18Feb 24, 2023Updated 3 years ago
- The Spacetime of Diffusion Models: An Information Geometry Perspective (ICLR 2026 Oral)☆32Feb 21, 2026Updated last month
- Official implementation of Hierarchical Context Merging: Better Long Context Understanding for Pre-trained LLMs (ICLR 2024).☆43Aug 6, 2024Updated last year
- Generate interleaved text and image content in a structured format you can directly pass to downstream APIs.☆29Oct 18, 2024Updated last year
- Patch-Mix Contrastive Learning with Audio Spectrogram Transformer on Respiratory Sound Classification (INTERSPEECH 2023)☆73Mar 11, 2025Updated last year
- Research work aimed at addressing the problem of modeling infinite-length context☆48Dec 18, 2025Updated 3 months ago
- This repository implements the paper "Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks".☆16Nov 3, 2017Updated 8 years ago