HITSZ-Miao-Group / Paper-Reading-List
☆9 · Updated last year
Alternatives and similar repositories for Paper-Reading-List
Users interested in Paper-Reading-List are comparing it to the repositories listed below.
- Awesome list for LLM pruning. ☆232 · Updated 6 months ago
- [NeurIPS 2024 Oral🔥] DuQuant: Distributing Outliers via Dual Transformation Makes Stronger Quantized LLMs. ☆161 · Updated 8 months ago
- An all-in-one repository of awesome LLM pruning papers, integrating useful resources and insights. ☆93 · Updated 6 months ago
- ☆256 · Updated 10 months ago
- A list of papers, docs, and code about efficient AIGC. This repo aims to provide information for efficient AIGC research, including languag… ☆184 · Updated 4 months ago
- Code for the NeurIPS 2024 paper: QuaRot, an end-to-end 4-bit inference of large language models. ☆398 · Updated 7 months ago
- ☆24 · Updated last month
- [ICLR 2025] OSTQuant: Refining Large Language Model Quantization with Orthogonal and Scaling Transformations for Better Distribution Fitt… ☆61 · Updated 2 months ago
- Spec-Bench: A Comprehensive Benchmark and Unified Evaluation Platform for Speculative Decoding (ACL 2024 Findings). ☆282 · Updated 2 months ago
- Awesome list for LLM quantization. ☆238 · Updated 2 weeks ago
- GitHub repo for OATS: Outlier-Aware Pruning through Sparse and Low Rank Decomposition. ☆13 · Updated 2 months ago
- Official implementation of the ICLR paper "Streamlining Redundant Layers to Compress Large Language Models". ☆29 · Updated last month
- A list of papers related to neural network quantization in recent AI conferences and journals. ☆653 · Updated 3 months ago
- A simple and effective LLM pruning approach. ☆763 · Updated 10 months ago
- 📰 Must-read papers and blogs on Speculative Decoding ⚡️ ☆811 · Updated this week
- An algorithm for weight-activation quantization (W4A4, W4A8) of LLMs, supporting both static and dynamic quantization. ☆137 · Updated last month
- [NeurIPS 2023] LLM-Pruner: On the Structural Pruning of Large Language Models. Supports Llama-3/3.1, Llama-2, LLaMA, BLOOM, Vicuna, Baich… ☆1,026 · Updated 8 months ago
- [ICML 2024] KIVI: A Tuning-Free Asymmetric 2bit Quantization for KV Cache. ☆304 · Updated 5 months ago
- ☆55 · Updated 6 months ago
- ☆201 · Updated 8 months ago
- An implementation of the DISP-LLM method from the NeurIPS 2024 paper "Dimension-Independent Structural Pruning for Large Language Models". ☆20 · Updated 2 months ago
- Official code implementation for the ICLR 2025 accepted paper "Dobi-SVD: Differentiable SVD for LLM Compression and Some New Perspectives". ☆34 · Updated 3 months ago
- ☆18 · Updated 3 months ago
- [ICML 2024] Official code for the paper "Revisiting Zeroth-Order Optimization for Memory-Efficient LLM Fine-Tuning: A Benchmark". ☆105 · Updated 11 months ago
- The official GitHub page for the survey paper "A Survey on Mixture of Experts in Large Language Models". ☆372 · Updated this week
- Official PyTorch implementation of "Outlier Weighed Layerwise Sparsity (OWL): A Missing Secret Sauce for Pruning LLMs to High Sparsity". ☆69 · Updated last year
- Paper list for Efficient Reasoning. ☆509 · Updated this week
- ☆46 · Updated last year
- Code repository of "Evaluating Quantized Large Language Models". ☆124 · Updated 9 months ago
- PiSSA: Principal Singular Values and Singular Vectors Adaptation of Large Language Models (NeurIPS 2024 Spotlight). ☆355 · Updated last week