thu-ml / Adaptive-Sparse-Trainer
Official implementation for "Pruning Large Language Models with Semi-Structural Adaptive Sparse Training" (AAAI 2025)
☆12Updated 2 weeks ago
Alternatives and similar repositories for Adaptive-Sparse-Trainer
Users who are interested in Adaptive-Sparse-Trainer are comparing it to the libraries listed below.
- ☆11Updated 9 months ago
- ☆13Updated 7 months ago
- Code Repository for the NeurIPS 2024 Paper "Toward Efficient Inference for Mixture of Experts".☆19Updated 8 months ago
- KV cache compression via sparse coding☆11Updated 2 months ago
- ☆14Updated 7 months ago
- Official code for the paper "HEXA-MoE: Efficient and Heterogeneous-Aware MoE Acceleration with Zero Computation Redundancy"☆13Updated 4 months ago
- A code sample demonstrating how to share and rebuild a PyTorch GPU tensor via its pointer/reference between different processes.☆12Updated 10 months ago
- Generic library for neural collapse and several derivative works on the phenomenon.☆12Updated 3 months ago
- [ICLR 2025] ELICIT: LLM Augmentation Via External In-context Capability☆11Updated 4 months ago
- This is the official PyTorch implementation for the paper: Filter, Correlate, Compress: Training-Free Token Reduction for MLLM Acceleration☆15Updated 4 months ago
- ☆11Updated 6 months ago
- ☆24Updated 2 months ago
- ChatCoach is a fitness correction system based on pose estimation and large language models (LLMs). The primary goal is to provide fitnes…☆8Updated 8 months ago
- Mobile-Bench: An Evaluation Benchmark for LLM-based Mobile Agents☆13Updated 7 months ago
- A deep learning intermediate representation for multi-platform compilation optimization☆10Updated 8 months ago
- [ICML 2025] Official PyTorch implementation of "FlatQuant: Flatness Matters for LLM Quantization"☆146Updated last month
- Official code implementation for 2025 ICLR accepted paper "Dobi-SVD : Differentiable SVD for LLM Compression and Some New Perspectives"☆36Updated 3 months ago
- Github Repo for OATS: Outlier-Aware Pruning through Sparse and Low Rank Decomposition☆14Updated 3 months ago
- [ICLR2025]: OSTQuant: Refining Large Language Model Quantization with Orthogonal and Scaling Transformations for Better Distribution Fitt…☆68Updated 3 months ago
- [NeurIPS 2024 Oral🔥] DuQuant: Distributing Outliers via Dual Transformation Makes Stronger Quantized LLMs.☆164Updated 9 months ago
- A sparse attention kernel supporting mix sparse patterns☆256Updated 5 months ago
- [ICML 2024] Quest: Query-Aware Sparsity for Efficient Long-Context LLM Inference☆305Updated last week
- SeerAttention: Learning Intrinsic Sparse Attention in Your LLMs☆137Updated this week
- ☆16Updated 6 months ago
- An algorithm for weight-activation quantization (W4A4, W4A8) of LLMs, supporting both static and dynamic quantization☆142Updated last month
- 2D-TPE: Two-Dimensional Positional Encoding Enhances Table Understanding for Large Language Models (WWW 2025)☆10Updated 3 months ago
- This is a web API for a book site☆9Updated 8 months ago
- In this repository, we delve into the basic concepts of Python from scratch. We explored everything from a beginner perspective who sta…☆8Updated 9 months ago
- Chrome extension designed to calculate the weighted average of grades from your university grade summary page.☆9Updated this week
- ☆6Updated 8 months ago