NUS-HPC-AI-Lab / Helen
The official implementation of "Helen: Optimizing CTR Prediction Models with Frequency-wise Hessian Eigenvalue Regularization"
☆15 Updated 8 months ago
Related projects
Alternatives and complementary repositories for Helen
- Official Repo for SparseLLM: Global Pruning of LLMs (NeurIPS 2024) ☆38 Updated this week
- [NeurIPS 2024] Official code of $\beta$-DPO: Direct Preference Optimization with Dynamic $\beta$ ☆31 Updated 3 weeks ago
- ☆108 Updated 4 months ago
- MagicPIG: LSH Sampling for Efficient LLM Generation ☆59 Updated 3 weeks ago
- Retrieval with Learned Similarities ☆15 Updated this week
- Official implementation of ICML 2024 paper "ExCP: Extreme LLM Checkpoint Compression via Weight-Momentum Joint Shrinking". ☆41 Updated 4 months ago
- This package implements THOR: Transformer with Stochastic Experts. ☆61 Updated 3 years ago
- Super-Efficient RLHF Training of LLMs with Parameter Reallocation ☆120 Updated this week
- ☆72 Updated last year
- The official implementation of paper: SimLayerKV: A Simple Framework for Layer-Level KV Cache Reduction. ☆38 Updated last month
- [ATTRIB @ NeurIPS 2024] When Attention Sink Emerges in Language Models: An Empirical View ☆29 Updated last month
- ☆47 Updated last year
- [AAAI 2024] Fluctuation-based Adaptive Structured Pruning for Large Language Models ☆37 Updated 10 months ago
- ☆17 Updated 4 months ago
- PyTorch implementation of paper "Response Length Perception and Sequence Scheduling: An LLM-Empowered LLM Inference Pipeline". ☆74 Updated last year
- A high-performance distributed deep learning system targeting large-scale and automated distributed training. If you have any interests, … ☆104 Updated 11 months ago
- Code associated with the paper **Draft & Verify: Lossless Large Language Model Acceleration via Self-Speculative Decoding** ☆139 Updated 5 months ago
- Official repository for MATES: Model-Aware Data Selection for Efficient Pretraining with Data Influence Models [NeurIPS 2024] ☆49 Updated last week
- ☆45 Updated 6 months ago
- [NeurIPS 2023] Model-enhanced Vector Index ☆21 Updated 6 months ago
- Repository of LV-Eval Benchmark ☆48 Updated 2 months ago
- [ACL 2024] Long-Context Language Modeling with Parallel Encodings ☆147 Updated 5 months ago
- Ouroboros: Speculative Decoding with Large Model Enhanced Drafting (EMNLP 2024 main) ☆76 Updated last month
- [EVA ICLR'23; LARA ICML'22] Efficient attention mechanisms via control variates, random features, and importance sampling ☆79 Updated last year
- The code and data for the paper JiuZhang3.0 ☆35 Updated 5 months ago
- [ICLR 2024 Spotlight] Code for the paper "Merge, Then Compress: Demystify Efficient SMoE with Hints from Its Routing Policy" ☆64 Updated 5 months ago
- The Official Implementation of Ada-KV: Optimizing KV Cache Eviction by Adaptive Budget Allocation for Efficient LLM Inference ☆38 Updated last week
- This PyTorch package implements PLATON: Pruning Large Transformer Models with Upper Confidence Bound of Weight Importance (ICML 2022). ☆41 Updated 2 years ago
- Reproduction of "RLCD: Reinforcement Learning from Contrast Distillation for Language Model Alignment" ☆65 Updated last year
- Shopping MMLU: A Multi-Task Online Shopping Benchmark for LLMs. ☆12 Updated 2 weeks ago