Official PyTorch implementation of DistiLLM-2: A Contrastive Approach Boosts the Distillation of LLMs (ICML 2025 Oral)
☆69Jun 27, 2025Updated last year
Alternatives and similar repositories for distillm-2
Users that are interested in distillm-2 are comparing it to the libraries listed below.
- This repo is for CaesarNeRF: Calibrated Semantic Representation for Few-Shot Generalizable Neural Rendering.☆14Mar 6, 2024Updated 2 years ago
- (ICLR 2025) Multi-Task Corrupted Prediction for Learning Robust Audio-Visual Speech Representation☆16Apr 29, 2025Updated last year
- Official code release of Hilbert Diffusion Model (PyTorch ver.)☆21Aug 17, 2024Updated last year
- Official PyTorch implementation of DistiLLM: Towards Streamlined Distillation for Large Language Models (ICML 2024)☆267Mar 13, 2025Updated last year
- Self-Contrastive Learning: Single-viewed Supervised Contrastive Framework using Sub-network (AAAI 2023)☆21Oct 28, 2023Updated 2 years ago
- SMART introduces a novel test-time framework where Small Language Models (SLMs) reason step-by-step, and Large Language Models (LLMs) pro…☆12Jul 9, 2025Updated 11 months ago
- (NeurIPS 2022) Understanding Cross-Domain Few-Shot Learning Based on Domain Similarity and Few-Shot Difficulty☆34Mar 19, 2024Updated 2 years ago
- LongAttn: Selecting Long-context Training Data via Token-level Attention☆15Jul 16, 2025Updated 11 months ago
- ☆13Dec 13, 2024Updated last year
- ☆11Jan 2, 2026Updated 6 months ago
- ☆21Jul 3, 2025Updated last year
- ☆16Oct 18, 2024Updated last year
- ☆31Jan 16, 2025Updated last year
- An open, public resource platform on informatics, programming, and computer science in the Azerbaijani language.☆46May 25, 2026Updated last month
- [AAAI 2021] "ROSITA: Refined BERT cOmpreSsion with InTegrAted techniques", Yuanxin Liu, Zheng Lin, Fengcheng Yuan☆14Oct 18, 2022Updated 3 years ago
- (AAAI 2021) Split-and-Bridge: Adaptable Class Incremental Learning within a Single Neural Network☆24Feb 3, 2021Updated 5 years ago
- Fast and Efficient MMD-based Fair PCA via Optimization over Stiefel Manifold (AAAI 2022)☆11Sep 27, 2022Updated 3 years ago
- ☆11Apr 19, 2021Updated 5 years ago
- Repo for the EMNLP'24 Paper "Dual-Space Knowledge Distillation for Large Language Models". A general white-box KD framework for both same…☆63Mar 21, 2026Updated 3 months ago
- WildVSR☆22Dec 13, 2023Updated 2 years ago
- Awesome LLM papers, news and projects about learning to reason with LLM, OpenAI o1, reasoning techniques, chain-of-thought (COT), Large …☆28Oct 10, 2024Updated last year
- A simple implementation of reverse mode automatic differentiation in C++ without the use of any libraries.☆12Jul 3, 2018Updated 7 years ago
- Code Implementation for "NASH: A Simple Unified Framework of Structured Pruning for Accelerating Encoder-Decoder Language Models" (EMNLP …☆17Oct 17, 2023Updated 2 years ago
- [ICLR 2025] MiniPLM: Knowledge Distillation for Pre-Training Language Models☆78Nov 23, 2024Updated last year
- The codes for ECCV'22: Learning to Train a Point Cloud Reconstruction Network without Matching☆10Nov 16, 2022Updated 3 years ago
- Official implementation of StochSync: a zero-shot approach for image generation in arbitrary spaces via stochastic diffusion synchronizat…☆21Jun 24, 2025Updated last year
- ☆22Oct 22, 2024Updated last year
- Compressed LLMs for Efficient Text Generation [ICLR'24 Workshop]☆90Sep 13, 2024Updated last year
- Some microbenchmarks and design docs before commencement☆11Feb 1, 2021Updated 5 years ago
- ☆40Jan 23, 2024Updated 2 years ago
- ICML2025: One Image is Worth a Thousand Words: A Usability Preservable Text-Image Collaborative Erasing Framework☆15Jun 24, 2025Updated last year
- Quickly hashing all subexpressions of a program modulo alpha-renaming☆17Sep 7, 2021Updated 4 years ago
- Official implementation of Hierarchical Context Merging: Better Long Context Understanding for Pre-trained LLMs (ICLR 2024).☆45Aug 6, 2024Updated last year
- [NAACL 2022] "Learning to Win Lottery Tickets in BERT Transfer via Task-agnostic Mask Training", Yuanxin Liu, Fandong Meng, Zheng Lin, Pe…☆15Oct 18, 2022Updated 3 years ago
- Generate interleaved text and image content in a structured format you can directly pass to downstream APIs.☆29Oct 18, 2024Updated last year
- ☆30Feb 24, 2026Updated 4 months ago
- ☆73Jun 23, 2025Updated last year
- Patch-Mix Contrastive Learning with Audio Spectrogram Transformer on Respiratory Sound Classification (INTERSPEECH 2023)☆75Mar 11, 2025Updated last year
- Research work aimed at addressing the problem of modeling infinite-length context☆49Dec 18, 2025Updated 6 months ago