shkim0116 / KLASS
[NeurIPS 2025 Spotlight] Implementation of "KLASS: KL-Guided Fast Inference in Masked Diffusion Models"
☆20 Updated 2 weeks ago
Alternatives and similar repositories for KLASS
Users who are interested in KLASS are comparing it to the repositories listed below
- Official Code Repository for the paper "Knowledge-Augmented Reasoning Distillation for Small Language Models in Knowledge-intensive Tasks… ☆42 Updated last year
- Official implementation of "OffsetBias: Leveraging Debiased Data for Tuning Evaluators" ☆25 Updated last year
- Model Stock: All we need is just a few fine-tuned models ☆128 Updated 4 months ago
- [ICLR 2025] Official PyTorch implementation of "DaWin: Training-free Dynamic Weight Interpolation for Robust Adaptation" ☆26 Updated 5 months ago
- [ICML 2024] Unveiling and Harnessing Hidden Attention Sinks: Enhancing Large Language Models without Training through Attention Calibrati… ☆46 Updated last year
- ☆20 Updated 5 months ago
- Paper reproduction of Google's SCoRE (Training Language Models to Self-Correct via Reinforcement Learning) ☆142 Updated last year
- [NeurIPS 2024 Spotlight] EMR-Merging: Tuning-Free High-Performance Model Merging ☆74 Updated 10 months ago
- Code release for "Generative Modeling of Weights: Generalization or Memorization?" ☆17 Updated 6 months ago
- Official repository of "Distort, Distract, Decode: Instruction-Tuned Model Can Refine its Response from Noisy Instructions", ICLR 2024 Sp… ☆21 Updated last year
- [NeurIPS 2025] Reasoning Models Better Express Their Confidence ☆22 Updated last month
- ☆198 Updated last week
- dParallel: Learnable Parallel Decoding for dLLMs ☆51 Updated 2 months ago
- ☆55 Updated 6 months ago
- Preference Learning for LLaVA ☆58 Updated last year
- Github repository for "Bring Reason to Vision: Understanding Perception and Reasoning through Model Merging" (ICML 2025) ☆86 Updated 3 months ago
- Official repository of "Localizing Task Information for Improved Model Merging and Compression" [ICML 2024] ☆51 Updated last week
- Official PyTorch implementation of DistiLLM-2: A Contrastive Approach Boosts the Distillation of LLMs (ICML 2025 Oral) ☆54 Updated 6 months ago
- An Efficient LLM Fine-Tuning Factory Optimized for MoE PEFT ☆131 Updated 9 months ago
- Code for Heima ☆58 Updated 8 months ago
- Source code of "Task arithmetic in the tangent space: Improved editing of pre-trained models". ☆108 Updated 2 years ago
- Official implementation of paper "Think-at-Hard: Selective Latent Iterations to Improve Reasoning Language Models" ☆55 Updated last week
- ☆136 Updated 9 months ago
- [ICLR 2025] When Attention Sink Emerges in Language Models: An Empirical View (Spotlight) ☆151 Updated 5 months ago
- [ICLR 2025 Oral] Knowledge Entropy Decay during Language Model Pretraining Hinders New Knowledge Acquisition ☆16 Updated last year
- [ICLR '25] Official Pytorch implementation of "Interpreting and Editing Vision-Language Representations to Mitigate Hallucinations" ☆96 Updated last month
- Code accompanying the paper "Massive Activations in Large Language Models" ☆187 Updated last year
- [ICLR 2025] Monet: Mixture of Monosemantic Experts for Transformers ☆74 Updated 6 months ago
- Official implementation of the NeurIPS 2025 paper "Soft Thinking: Unlocking the Reasoning Potential of LLMs in Continuous Concept Space" ☆291 Updated 2 weeks ago
- ☆68 Updated 9 months ago