hyungyokim / LIA_AMXGPU
[ISCA'25] LIA: A Single-GPU LLM Inference Acceleration with Cooperative AMX-Enabled CPU-GPU Computation and CXL Offloading
☆12 Updated 4 months ago
Alternatives and similar repositories for LIA_AMXGPU
Users who are interested in LIA_AMXGPU are comparing it to the libraries listed below.
- HyFiSS: A Hybrid Fidelity Stall-Aware Simulator for GPGPUs ☆37 Updated 11 months ago
- ☆28 Updated 2 years ago
- ☆79 Updated 4 years ago
- A Cycle-level simulator for M2NDP ☆32 Updated 2 months ago
- Artifact for paper "PIM is All You Need: A CXL-Enabled GPU-Free System for LLM Inference", ASPLOS 2025 ☆101 Updated 6 months ago
- MAGIS: Memory Optimization via Coordinated Graph Transformation and Scheduling for DNN (ASPLOS'24) ☆55 Updated last year
- PIM-DL: Expanding the Applicability of Commodity DRAM-PIMs for Deep Learning via Algorithm-System Co-Optimization ☆33 Updated last year
- ☆26 Updated 2 years ago
- ☆31 Updated 5 years ago
- ☆13 Updated 4 years ago
- This serves as a repository for reproducibility of the SC21 paper "In-Depth Analyses of Unified Virtual Memory System for GPU Accelerated… ☆36 Updated 2 years ago
- NeuPIMs: NPU-PIM Heterogeneous Acceleration for Batched LLM Inferencing ☆99 Updated last year
- The Artifact of NeoMem: Hardware/Software Co-Design for CXL-Native Memory Tiering ☆59 Updated last year
- ☆66 Updated 4 years ago
- PIM-ML is a benchmark for training machine learning algorithms on the UPMEM architecture, which is the first publicly-available real-worl… ☆24 Updated 10 months ago
- UPMEM LLM Framework allows profiling PyTorch layers and functions and simulating those layers/functions with a given hardware profile. ☆36 Updated 3 months ago
- A fast, accurate, and easy-to-integrate memory simulator that models memory system performance with bandwidth-latency curves. ☆30 Updated 3 weeks ago
- ☆31 Updated 4 months ago
- The source code for GPGPUSim+Ramulator simulator. In this version, GPGPUSim uses Ramulator to simulate the DRAM. This simulator is used t… ☆58 Updated 6 years ago
- ☆202 Updated 2 weeks ago
- ☆27 Updated last month
- Source code of the simulator used in the Mosaic paper from MICRO 2017: "Mosaic: A GPU Memory Manager with Application-Transparent Support… ☆49 Updated 7 years ago
- Artifact for "DX100: A Programmable Data Access Accelerator for Indirection (ISCA 2025)" paper ☆13 Updated this week
- Simulator code of the paper "Dissecting and Modeling the Architecture of Modern GPU Cores" ☆40 Updated 3 weeks ago
- ☆27 Updated 11 months ago
- TileFlow is a performance analysis tool based on Timeloop for fusion dataflows ☆62 Updated last year
- Horizontal Fusion ☆24 Updated 3 years ago
- MultiPIM: A Detailed and Configurable Multi-Stack Processing-In-Memory Simulator ☆56 Updated 4 years ago
- [PACT'24] GraNNDis. A fast and unified distributed graph neural network (GNN) training framework for both full-batch (full-graph) and min… ☆10 Updated last year
- LLMServingSim: A HW/SW Co-Simulation Infrastructure for LLM Inference Serving at Scale ☆148 Updated 3 months ago