SLiM: One-shot Quantized Sparse Plus Low-rank Approximation of LLMs (ICML 2025)
☆32Nov 28, 2025Updated 3 months ago
Alternatives and similar repositories for slim
Users that are interested in slim are comparing it to the libraries listed below
Sorting:
- Github Repo for OATS: Outlier-Aware Pruning through Sparse and Low Rank Decomposition☆18Apr 16, 2025Updated 10 months ago
- Official implementation for "Pruning Large Language Models with Semi-Structural Adaptive Sparse Training" (AAAI 2025)☆18Jul 1, 2025Updated 8 months ago
- ☆15Nov 7, 2024Updated last year
- ☆40Nov 22, 2025Updated 3 months ago
- High Performance FP8 GEMM Kernels for SM89 and later GPUs.☆20Jan 24, 2025Updated last year
- ☆25Oct 31, 2024Updated last year
- ☆55Jul 7, 2025Updated 7 months ago
- ☆32Nov 11, 2024Updated last year
- xKV: Cross-Layer SVD for KV-Cache Compression☆44Nov 30, 2025Updated 3 months ago
- RL with Experience Replay☆55Jul 27, 2025Updated 7 months ago
- EE 272B - VLSI Design Project☆15Jun 24, 2021Updated 4 years ago
- [EMNLP 2024] RoLoRA: Fine-tuning Rotated Outlier-free LLMs for Effective Weight-Activation Quantization☆38Sep 24, 2024Updated last year
- [NeurIPS 2025] Think Silently, Think Fast: Dynamic Latent Compression of LLM Reasoning Chains☆79Jul 29, 2025Updated 7 months ago
- Boosting 4-bit inference kernels with 2:4 Sparsity☆93Sep 4, 2024Updated last year
- [ICML24] Pruner-Zero: Evolving Symbolic Pruning Metric from scratch for LLMs☆98Nov 25, 2024Updated last year
- LISA: Layerwise Importance Sampling for Memory-Efficient Large Language Model Fine-Tuning☆36Apr 4, 2024Updated last year
- Official repository for paper: O1-Pruner: Length-Harmonizing Fine-Tuning for O1-Like Reasoning Pruning☆97Feb 21, 2025Updated last year
- Implementation of a Systolic Array based sorting engine on an FPGA using Verilog☆11May 11, 2017Updated 8 years ago
- Entropy-Driven GRPO with Guided Error Correction for Advantage Diversity☆22Aug 28, 2025Updated 6 months ago
- [NAACL 24 Oral] LoRETTA: Low-Rank Economic Tensor-Train Adaptation for Ultra-Low-Parameter Fine-Tuning of Large Language Models☆39Jan 9, 2025Updated last year
- ☆52Nov 5, 2024Updated last year
- Official Pytorch Implementation of Our Paper Accepted at ICLR 2024-- Dynamic Sparse No Training: Training-Free Fine-tuning for Sparse LLM…☆50Apr 9, 2024Updated last year
- Conditional DDPM for characterizing radio sources from dirty images. (autumn 2023)☆11Nov 30, 2023Updated 2 years ago
- [KDD24-ADS] R-Eval: A Unified Toolkit for Evaluating Domain Knowledge of Retrieval Augmented Large Language Models☆11Apr 9, 2024Updated last year
- ☆10Apr 21, 2025Updated 10 months ago
- ☆40Jan 16, 2026Updated last month
- The official Python library for Openlayer, the Continuous Model Improvement Platform for AI. 📈☆16Updated this week
- Efficient single-pass hyperdimensional classifier. Mirror of https://gitlab.com/biaslab/onlinehd☆11Jan 31, 2021Updated 5 years ago
- Official code for "Algorithmic Capabilities of Random Transformers" (NeurIPS 2024)☆16Sep 28, 2024Updated last year
- ☆11Nov 24, 2020Updated 5 years ago
- Code for the ISCAS23 paper "The Hardware Impact of Quantization and Pruning for Weights in Spiking Neural Networks"☆11Apr 20, 2023Updated 2 years ago
- ☆10Dec 3, 2024Updated last year
- Vector Symbolic Architecture library☆11Updated this week
- ☆14May 21, 2024Updated last year
- Official implementation for Text Generation Beyond Discrete Token Sampling☆21Aug 11, 2025Updated 6 months ago
- Working 8x8 systolic array hardware implemented in Xilinx Vivado, operated and controlled in software using Xilinx Vitis☆14Feb 16, 2024Updated 2 years ago
- The official code for "Advancing Multimodal Large Language Models with Quantization-Aware Scale Learning for Efficient Adaptation" | [MM2…☆14Dec 7, 2024Updated last year
- Code for paper "Concrete Subspace Learning based Interference Elimination for Multi-task Model Fusion"