A curated list of high-quality papers on resource-efficient LLMs
★158 · Mar 15, 2025 · Updated last year
Alternatives and similar repositories for Awesome-Resource-Efficient-LLM-Papers
Users that are interested in Awesome-Resource-Efficient-LLM-Papers are comparing it to the libraries listed below.
- [TMLR 2024] Efficient Large Language Models: A Survey · ★1,257 · Jun 23, 2025 · Updated 9 months ago
- Official Repo for SparseLLM: Global Pruning of LLMs (NeurIPS 2024) · ★67 · Mar 27, 2025 · Updated last year
- ★38 · Jan 15, 2021 · Updated 5 years ago
- Large Language Model (LLM) Systems Paper List · ★1,902 · Mar 24, 2026 · Updated 2 weeks ago
- Proteus: A High-Throughput Inference-Serving System with Accuracy Scaling · ★12 · Mar 7, 2024 · Updated 2 years ago
- A curated list for Efficient Large Language Models · ★1,980 · Jun 17, 2025 · Updated 9 months ago
- ★42 · Dec 15, 2022 · Updated 3 years ago
- SFS: A Smart OS Scheduler for Serverless Function Workloads (SC'22) · ★13 · Dec 15, 2022 · Updated 3 years ago
- These are papers that I read and reviewed related to NLP, CV, and Deep Learning. You can check paper links and my reviews. · ★13 · Jan 3, 2024 · Updated 2 years ago
- Survey Paper List - Efficient LLM and Foundation Models · ★264 · Sep 22, 2024 · Updated last year
- Customized Inference Engine for Multiverse Models · ★25 · Jun 27, 2025 · Updated 9 months ago
- Summary of some awesome work for optimizing LLM inference · ★232 · Feb 14, 2026 · Updated last month
- (ICLR 2025) BinaryDM: Accurate Weight Binarization for Efficient Diffusion Models · ★26 · Oct 4, 2024 · Updated last year
- Reading notes on Speculative Decoding papers · ★29 · Feb 24, 2026 · Updated last month
- A curated list of Awesome LLM/VLM Inference Papers with Codes: Flash-Attention, Paged-Attention, WINT8/4, Parallelism, etc. · ★5,130 · Updated this week
- ★16 · Apr 15, 2025 · Updated 11 months ago
- Awesome LLM compression research papers and tools. · ★1,806 · Feb 23, 2026 · Updated last month
- mi-optimize is a versatile tool designed for the quantization and evaluation of large language models (LLMs). The library's seamless inte… · ★25 · Nov 28, 2024 · Updated last year
- Ray tracing part for the optical system design course. · ★15 · Jul 13, 2023 · Updated 2 years ago
- Awesome list for LLM pruning. · ★287 · Oct 11, 2025 · Updated 6 months ago
- CORE-V eXtension Interface compliant RISC-V [F|Zfinx] Coprocessor · ★14 · Nov 12, 2025 · Updated 5 months ago
- FBI-LLM: Scaling Up Fully Binarized LLMs from Scratch via Autoregressive Distillation · ★51 · Aug 24, 2025 · Updated 7 months ago
- The official implementation of the DAC 2024 paper GQA-LUT · ★22 · Dec 20, 2024 · Updated last year
- ★47 · Jun 7, 2024 · Updated last year
- ★20 · Nov 20, 2024 · Updated last year
- HFODetector is a Python package that is capable of detecting HFOs with STE / MNI / Hilbert detectors. Detection speed is increased by u… · ★13 · Feb 16, 2025 · Updated last year
- GitHub Repository for KDD 2022 paper "Saliency-Regularized Deep Multi-Task Learning" · ★12 · Sep 26, 2023 · Updated 2 years ago
- ★10 · Apr 3, 2024 · Updated 2 years ago
- ★27 · Aug 31, 2023 · Updated 2 years ago
- ★11 · Mar 4, 2026 · Updated last month
- LLM GPU Calculator · ★21 · Aug 19, 2023 · Updated 2 years ago
- Code for "Multitask Vision-Language Prompt Tuning" https://arxiv.org/abs/2211.11720 · ★57 · Jun 5, 2024 · Updated last year
- A list of papers, docs, codes about efficient AIGC. This repo is aimed to provide the info for efficient AIGC research, including languag… · ★205 · Feb 10, 2025 · Updated last year
- ★17 · Dec 9, 2022 · Updated 3 years ago
- MICRO 2023 Evaluation Artifact for TeAAL · ★10 · Oct 26, 2023 · Updated 2 years ago
- Code Repository of Evaluating Quantized Large Language Models · ★134 · Sep 8, 2024 · Updated last year
- [Nature Communications] LiveIdeaBench: Evaluating LLMs' Scientific Creativity and Idea Generation with Minimal C… · ★24 · Mar 8, 2026 · Updated last month
- Repository that contains the code for the paper titled, 'Unifying Distillation with Personalization in Federated Learning'. · ★13 · May 31, 2021 · Updated 4 years ago
- SpotServe: Serving Generative Large Language Models on Preemptible Instances · ★134 · Feb 22, 2024 · Updated 2 years ago