Official PyTorch implementation of DistiLLM-2: A Contrastive Approach Boosts the Distillation of LLMs (ICML 2025 Oral)
☆68 · Jun 27, 2025 · Updated 10 months ago
Alternatives and similar repositories for distillm-2
Users that are interested in distillm-2 are comparing it to the libraries listed below.
- (ICLR 2025) Multi-Task Corrupted Prediction for Learning Robust Audio-Visual Speech Representation ☆16 · Apr 29, 2025 · Updated last year
- Official code release of Hilbert Diffusion Model (PyTorch ver.) ☆21 · Aug 17, 2024 · Updated last year
- Official PyTorch implementation of DistiLLM: Towards Streamlined Distillation for Large Language Models (ICML 2024) ☆262 · Mar 13, 2025 · Updated last year
- SMART introduces a novel test-time framework where Small Language Models (SLMs) reason step-by-step, and Large Language Models (LLMs) pro… ☆12 · Jul 9, 2025 · Updated 10 months ago
- (NeurIPS 2022) Understanding Cross-Domain Few-Shot Learning Based on Domain Similarity and Few-Shot Difficulty ☆34 · Mar 19, 2024 · Updated 2 years ago
- ☆21 · Mar 3, 2026 · Updated 2 months ago
- LongAttn: Selecting Long-context Training Data via Token-level Attention ☆15 · Jul 16, 2025 · Updated 10 months ago
- Learning Efficient Vision Transformers via Fine-Grained Manifold Distillation. NeurIPS 2022. ☆34 · Oct 18, 2022 · Updated 3 years ago
- Estimators for Information Theoretic Functionals using Influence Functions ☆11 · Apr 17, 2016 · Updated 10 years ago
- ☆21 · Jul 3, 2025 · Updated 10 months ago
- Official PyTorch implementation of SynergyNeRF: "Synergistic Integration of Coordinate Network and Tensorial Feature for Improving NeRFs … ☆12 · Sep 23, 2024 · Updated last year
- [AAAI 2021] "ROSITA: Refined BERT cOmpreSsion with InTegrAted techniques", Yuanxin Liu, Zheng Lin, Fengcheng Yuan ☆14 · Oct 18, 2022 · Updated 3 years ago
- (SLT 2024) Learning Video Temporal Dynamics with Cross-Modal Attention for Robust Audio-Visual Speech Recognition ☆13 · Oct 22, 2024 · Updated last year
- Repo for the EMNLP'24 Paper "Dual-Space Knowledge Distillation for Large Language Models". A general white-box KD framework for both same… ☆63 · Mar 21, 2026 · Updated 2 months ago
- WildVSR ☆22 · Dec 13, 2023 · Updated 2 years ago
- Awesome LLM papers, news and projects about learning to reason with LLM, OpenAI o1, reasonning techniques, chain-of-thought (COT), Large … ☆28 · Oct 10, 2024 · Updated last year
- Open-source code and data for ShadowNet(S&P Oakland'23) ☆12 · Mar 11, 2024 · Updated 2 years ago
- Code Implementation for "NASH: A Simple Unified Framework of Structured Pruning for Accelerating Encoder-Decoder Language Models" (EMNLP … ☆17 · Oct 17, 2023 · Updated 2 years ago
- [ICLR 2025] MiniPLM: Knowledge Distillation for Pre-Training Language Models ☆77 · Nov 23, 2024 · Updated last year
- LSTM GRU with exact backpropagation derivation and implementation ☆13 · Nov 27, 2017 · Updated 8 years ago
- RL training framework for diffusion and omni-modality models ☆127 · May 16, 2026 · Updated last week
- This is the official implementation of NNSplitter (ICML'23) ☆12 · Jun 11, 2024 · Updated last year
- Official implementation of StochSync: a zero-shot approach for image generation in arbitrary spaces via stochastic diffusion synchronizat… ☆21 · Jun 24, 2025 · Updated 10 months ago
- ☆22 · Oct 22, 2024 · Updated last year
- TSQP: Safeguarding Real-Time Inference for Quantization Neural Networks on Edge Devices (Accepted to S&P 2025) ☆17 · Sep 16, 2025 · Updated 8 months ago
- ☆19 · Jan 26, 2025 · Updated last year
- Compressed LLMs for Efficient Text Generation [ICLR'24 Workshop] ☆90 · Sep 13, 2024 · Updated last year
- Official Implementation of the paper "Jointly Reinforcing Diversity and Quality in Language Model Generations" ☆59 · May 8, 2026 · Updated 2 weeks ago
- ☆18 · Oct 22, 2024 · Updated last year
- Implementation of SayCan, organized as a python project. ☆14 · Sep 7, 2023 · Updated 2 years ago
- ☆13 · Sep 25, 2023 · Updated 2 years ago
- ICML2025: One Image is Worth a Thousand Words: A Usability Preservable Text-Image Collaborative Erasing Framework ☆15 · Jun 24, 2025 · Updated 10 months ago
- OTOv1-v3, NeurIPS, ICLR, TMLR, DNN Training, Compression, Structured Pruning, Erasing Operators, CNN, Diffusion, LLM ☆50 · Oct 10, 2024 · Updated last year
- Quickly hashing all subexpressions of a program modulo alpha-renaming ☆17 · Sep 7, 2021 · Updated 4 years ago
- Official implementation of Hierarchical Context Merging: Better Long Context Understanding for Pre-trained LLMs (ICLR 2024). ☆45 · Aug 6, 2024 · Updated last year
- Code for "MetaFun: Meta-Learning with Iterative Functional Updates" ☆14 · Aug 27, 2020 · Updated 5 years ago
- Invariant Feature Regularization for Fair Face Recognition (ICCV'23) ☆15 · Oct 23, 2023 · Updated 2 years ago
- ☆16 · Feb 21, 2023 · Updated 3 years ago
☆68 · Jun 23, 2025 · Updated 11 months ago