casys-kaist/NeuPIMs

NeuPIMs: NPU-PIM Heterogeneous Acceleration for Batched LLM Inferencing

☆124

Alternatives and similar repositories for NeuPIMs

Users who are interested in NeuPIMs are comparing it to the repositories listed below.

scale-snu / attacc_simulator
☆163 · Jun 24, 2024 · Updated 2 years ago
PSAL-POSTECH / ONNXim
ONNXim is a fast cycle-level simulator that can model multi-core NPUs for DNN inference
☆209 · Jan 8, 2026 · Updated 6 months ago
SAITPublic / PIMSimulator
Processing-In-Memory (PIM) Simulator
☆248 · Dec 12, 2024 · Updated last year
arkhadem / aim_simulator
A simulator for SK hynix AiM PIM architecture based on Ramulator 2.0
☆69 · Jul 22, 2025 · Updated last year
7bvcxz / PIMsim
☆20 · Jun 1, 2023 · Updated 3 years ago
casys-kaist / mNPUsim
mNPUsim: A Cycle-accurate Multi-core NPU Simulator (IISWC 2023)
☆77 · Dec 29, 2025 · Updated 7 months ago
VIA-Research / uPIMulator
☆186 · Feb 1, 2025 · Updated last year
casys-kaist / LLMServingSim
LLMServingSim 2.0: A Unified Simulator for Heterogeneous and Disaggregated LLM Serving Infrastructure
☆346 · Jul 15, 2026 · Updated 2 weeks ago
leesou / H2-LLM-ISCA-2025
H2-LLM: Hardware-Dataflow Co-Exploration for Heterogeneous Hybrid-Bonding-based Low-Batch LLM Inference
☆114 · Apr 26, 2025 · Updated last year
CMU-SAFARI / ramulator2
Ramulator 2.0 is a modern, modular, extensible, and fast cycle-accurate DRAM simulator. It provides support for agile implementation and …
☆606 · Jul 6, 2026 · Updated 3 weeks ago
casys-kaist / DaCapo
☆20 · Nov 5, 2024 · Updated last year
upmem / upmem_llm_framework
UPMEM LLM Framework allows profiling PyTorch layers and functions and simulating those layers/functions with a given hardware profile.
☆43 · Jul 2, 2026 · Updated 3 weeks ago
Yufeng98 / CENT
Artifact for paper "PIM is All You Need: A CXL-Enabled GPU-Free System for LLM Inference", ASPLOS 2025
☆144 · May 3, 2025 · Updated last year
maestro-project / magma
☆18 · Jun 17, 2022 · Updated 4 years ago
thu-nics / UniNDP
GitHub repository of HPCA 2025 paper "UniNDP: A Unified Compilation and Simulation Tool for Near DRAM Processing Architectures"
☆23 · Jan 18, 2026 · Updated 6 months ago
umd-memsys / DRAMsim3
DRAMsim3: a Cycle-accurate, Thermal-Capable DRAM Simulator
☆496 · Aug 3, 2024 · Updated last year
PrincetonUniversity / LLMCompass
☆262 · Oct 24, 2025 · Updated 9 months ago
pku-liang / Sanger
A co-design architecture on sparse attention
☆55 · Aug 23, 2021 · Updated 4 years ago
leesou / PIM-DL-ASPLOS
PIM-DL: Expanding the Applicability of Commodity DRAM-PIMs for Deep Learning via Algorithm-System Co-Optimization
☆37 · Feb 21, 2024 · Updated 2 years ago
godfather991 / UniNDP
Artifact material for [HPCA 2025] #2108 "UniNDP: A Unified Compilation and Simulation Tool for Near DRAM Processing Architectures"
☆61 · Sep 1, 2025 · Updated 10 months ago
SET-Scheduling-Project / SoMa-HPCA2025
☆30 · Feb 27, 2025 · Updated last year
scalesim-project / SCALE-Sim
Repository to host and maintain SCALE-Sim code
☆502 · Jun 28, 2026 · Updated last month
UVA-LavaLab / PIMeval-PIMbench
PIMeval simulator and PIMbench suite
☆51 · Nov 22, 2025 · Updated 8 months ago
yc2367 / P3-LLM
☆23 · Apr 3, 2026 · Updated 3 months ago
clevercool / ANT-Quantization
☆123 · Nov 17, 2023 · Updated 2 years ago
sjtu-zhao-lab / SALO
An efficient spatial accelerator enabling hybrid sparse attention mechanisms for long sequences
☆32 · Mar 7, 2024 · Updated 2 years ago
yongwonshin / PIMFlow
☆15 · Mar 10, 2024 · Updated 2 years ago
CMU-SAFARI / ramulator-pim
A fast and flexible simulation infrastructure for exploring general-purpose processing-in-memory (PIM) architectures. Ramulator-PIM combi…
☆198 · Oct 1, 2022 · Updated 3 years ago
tsinghua-ideal / spada-sim
The simulator for SPADA, an SpGEMM accelerator with adaptive dataflow
☆47 · Jan 26, 2023 · Updated 3 years ago
AIS-SNU / PID-Comm
☆28 · Nov 29, 2024 · Updated last year
PKUZHOU / GNNear-PACT-2022
GNNear: Accelerating Full-Batch Training of Graph Neural Networks with Near-Memory Processing
☆17 · Sep 15, 2022 · Updated 3 years ago
ChaseLab-PKU / InstAttention
InstAttention: In-Storage Attention Offloading for Cost-Effective Long-Context LLM Inference
☆18 · Mar 30, 2025 · Updated last year
hongsunjang / HILOS
[ASPLOS'26] HILOS: A Cost-Effective Near-Storage Processing Solution for Offline Inference of Long-Context LLMs
☆20 · Jan 18, 2026 · Updated 6 months ago
mit-han-lab / spatten
[HPCA'21] SpAtten: Efficient Sparse Attention Architecture with Cascade Token and Head Pruning
☆137 · Aug 27, 2024 · Updated last year
PSAL-POSTECH / M2NDP-public
A cycle-level simulator for M2NDP
☆42 · Aug 14, 2025 · Updated 11 months ago
Systems-ShiftLab / MultiPIM
MultiPIM: A Detailed and Configurable Multi-Stack Processing-In-Memory Simulator
☆59 · Jun 12, 2021 · Updated 5 years ago
SET-Scheduling-Project / GEMINI-HPCA2024
Open-source Framework for HPCA2024 paper: Gemini: Mapping and Architecture Co-exploration for Large-scale DNN Chiplet Accelerators
☆116 · Apr 28, 2025 · Updated last year
jha-lab / acceltran
[TCAD'23] AccelTran: A Sparsity-Aware Accelerator for Transformers
☆62 · Nov 22, 2023 · Updated 2 years ago
wangxy-2000 / pimsim-nn
☆64 · Feb 29, 2024 · Updated 2 years ago