merrymercy/Awesome-Efficient-LLM

A curated list for Efficient Large Language Models

☆11

Alternatives and similar repositories for Awesome-Efficient-LLM

Users that are interested in Awesome-Efficient-LLM are comparing it to the libraries listed below.

TroelsMortensen / rmiexamples
☆11 · Mar 29, 2020 · Updated 6 years ago
changwoolee / BLAST
[NeurIPS 2024] BLAST: Block Level Adaptive Structured Matrix for Efficient Deep Neural Network Inference
☆18 · Nov 6, 2024 · Updated last year
isEmmanuelOlowe / llm-cost-estimator
Estimating hardware and cloud costs of LLMs and transformer projects
☆22 · Apr 1, 2026 · Updated 3 months ago
jakobhartmann / tensor-eqs-mcts
Optimizing Tensor Computation Graphs with Equality Saturation and Monte Carlo Tree Search
☆15 · Aug 9, 2024 · Updated last year
ChandlerGuan / kperfir_artifact
☆19 · May 9, 2025 · Updated last year
EfficientLLMSys / MuxServe
☆15 · Jun 26, 2024 · Updated 2 years ago
deathwings602 / Unified-IR
A deep learning intermediate representation for multi-platform compilation optimization
☆10 · Oct 28, 2024 · Updated last year
yichuan-w / raytracer
raytracer
☆10 · Jul 18, 2022 · Updated 4 years ago
xlab-uiuc / reading-system-verification-papers
A reading group for system verification papers
☆10 · Sep 28, 2023 · Updated 2 years ago
iDoka / hdl-secded-producer
MATLAB/Octave generator of Hamming ECC coding. Output format is Verilog HDL.
☆12 · Dec 27, 2022 · Updated 3 years ago
cyd3r / notify-free-gpu
A Telegram bot that sends you a message when the GPU is in use
☆11 · May 27, 2024 · Updated 2 years ago
pku-liang / popa
A unified programming framework for high and portable performance across FPGAs and GPUs
☆11 · Mar 23, 2025 · Updated last year
Bruce-Lee-LY / cuda_back2back_hgemm
Use tensor core to calculate back-to-back HGEMM (half-precision general matrix multiplication) with MMA PTX instruction.
☆13 · Nov 3, 2023 · Updated 2 years ago
SVF-tools / ACT
Abstract Constraint Transformation
☆18 · Jul 21, 2026 · Updated last week
adammikulis / EdgeRag
EdgeRag is a program that runs large language models and vector databases on your local device
☆15 · May 29, 2024 · Updated 2 years ago
Mungeryang / colqwen3
The code used to train and run inference with the ColQwen3 model. Welcome to follow and star! ⭐️⭐️⭐️ https://huggingface.co/goodman2001/…
☆15 · Jul 4, 2026 · Updated 3 weeks ago
spcl / apfp
FPGA acceleration of arbitrary precision floating point computations.
☆41 · May 17, 2022 · Updated 4 years ago
ATNF / askap-benchmarks
ASKAP Benchmark Packages
☆13 · Nov 3, 2023 · Updated 2 years ago
E3SM-Project / HICCUP
Hindcast Initial Condition Creation Utility/Processor
☆12 · May 5, 2026 · Updated 2 months ago
Aneureka / teddy
🌏 Teddy is a tiny but scalable http server based on Java NIO, inspired by netty.
☆11 · Dec 26, 2019 · Updated 6 years ago
monellz / FlashTensor
☆19 · Mar 4, 2025 · Updated last year
CMU-SAFARI / GPGPUSim-Ramulator
The source code for GPGPUSim+Ramulator simulator. In this version, GPGPUSim uses Ramulator to simulate the DRAM. This simulator is used t…
☆62 · Sep 30, 2019 · Updated 6 years ago
afalsmadi / FUSE
File System in User Space
☆13 · Oct 31, 2019 · Updated 6 years ago
heymesut / SJTU_microe
An FPGA-based neural network inference accelerator, which won third place in DAC-SDC
☆28 · May 11, 2022 · Updated 4 years ago
YashasSamaga / ConvolutionBuildingBlocks
GEMM and Winograd based convolutions using CUTLASS
☆28 · Jul 15, 2020 · Updated 6 years ago
MaoZiming / papers
Paper-reading notes for Berkeley OS prelim exam.
☆14 · Aug 28, 2024 · Updated last year
VAGOsolutions / sauerkrautlm-colpali
☆16 · Mar 1, 2026 · Updated 4 months ago
shane-kercheval / mcp-client-agent
CLI that uses DSPy to interact with MCP servers.
☆24 · Mar 10, 2025 · Updated last year
kaist-ina / Trinity-AE
Source code for Trinity (ASPLOS 2026)
☆26 · Apr 24, 2026 · Updated 3 months ago
pulp-platform / neureka
2-8bit weights, 8-bit activations flexible Neural Processing Engine for PULP clusters
☆34 · May 20, 2026 · Updated 2 months ago
GATECH-EIC / DepthShrinker
[ICML 2022] "DepthShrinker: A New Compression Paradigm Towards Boosting Real-Hardware Efficiency of Compact Neural Networks", by Yonggan …
☆35 · Jul 12, 2022 · Updated 4 years ago
alan-hpc / cuda_op_benchmark
An easily extensible framework for understanding and optimizing CUDA operators, intended for learning purposes only
☆18 · Jun 13, 2024 · Updated 2 years ago
savasy / TC32
Text Classification Dataset for Turkish Language
☆10 · Nov 16, 2021 · Updated 4 years ago
Scientific-Computing-Lab / STREAMer
STREAMer: Benchmarking remote volatile and non-volatile memory bandwidth
☆18 · Aug 21, 2023 · Updated 2 years ago
apecloud / kubebench
A Kubernetes operator for running benchmark tests on databases to evaluate their performance.
☆16 · Jul 22, 2026 · Updated last week
ShiqiYu / AIEthics
Manuscript of the book "AI Regulations, Ethics, and Social Impact"
☆14 · Aug 28, 2021 · Updated 4 years ago
daodavid / gravity-simulation
Python package for calculation and simulation of n-bodies interaction.
☆30 · Apr 20, 2023 · Updated 3 years ago
topling / topling-ark
Topling core libraries in an ark
☆18 · May 30, 2026 · Updated last month
Makiras / makiras_dns_refact
DNS Server, School homework, support DoT / resolve by list
☆10 · Sep 16, 2020 · Updated 5 years ago