fredrickang / LaLaRANDLinks

LaLaRAND: Flexible Layer-by-Layer CPU/GPU Scheduling for Real-Time DNN Tasks

☆15

Alternatives and similar repositories for LaLaRAND

Users that are interested in LaLaRAND are comparing it to the libraries listed below

Sorting:

Soroosh129 / NeuOS
Source code for the paper: "A Latency-Predictable Multi-Dimensional Optimization Framework forDNN-driven Autonomous Systems"
☆22Updated 4 years ago
casys-kaist / LLMServingSim
LLMServingSim: A HW/SW Co-Simulation Infrastructure for LLM Inference Serving at Scale
☆160Updated 4 months ago
Raphael-Hao / Abacus
☆38Updated 5 months ago
AIS-SNU / PID-Comm
☆27Updated last year
Sys-KU / DeepPlan
[ACM EuroSys 2023] Fast and Efficient Model Serving Using Multi-GPUs with Direct-Host-Access
☆57Updated 3 months ago
VIA-Research / vTrain
☆73Updated 6 months ago
PrincetonUniversity / LLMCompass
☆209Updated last month
casys-kaist / glet
☆53Updated 11 months ago
Kyrie-Zhao / awesome-real-time-AI
This is a list of awesome edgeAI inference related papers.
☆98Updated last year
platformxlab / G10
☆40Updated 2 years ago
AIS-SNU / GraNNDis_Artifact
[PACT'24] GraNNDis. A fast and unified distributed graph neural network (GNN) training framework for both full-batch (full-graph) and min…
☆10Updated last year
mrsnu / band
Multi-DNN Inference Engine for Heterogeneous Mobile Processors
☆35Updated last year
UMass-LIDS / Proteus
Proteus: A High-Throughput Inference-Serving System with Accuracy Scaling
☆12Updated last year
casys-kaist / HUVM
☆24Updated 3 years ago
casys-kaist / NeuPIMs
NeuPIMs: NPU-PIM Heterogeneous Acceleration for Batched LLM Inferencing
☆100Updated last year
casys-kaist / casys-kaist.github.io
☆18Updated 3 weeks ago
upmem / upmem_llm_framework
UPMEM LLM Framework allows profiling PyTorch layers and functions and simulate those layers/functions with a given hardware profile.
☆36Updated 3 months ago
yonsei-hpcp / pid-join
☆11Updated 6 months ago
leesou / PIM-DL-ASPLOS
PIM-DL: Expanding the Applicability of Commodity DRAM-PIMs for Deep Learning via Algorithm-System Co-Optimization
☆33Updated last year
msr-fiddle / synergy
☆51Updated 2 years ago
sjtu-epcc / DVABatch
☆21Updated 3 years ago
pku-liang / MAGIS
MAGIS: Memory Optimization via Coordinated Graph Transformation and Scheduling for DNN (ASPLOS'24)
☆55Updated last year
jashwantraj92 / cocktail
☆15Updated last year
ranggihwang / Pregated_MoE
☆57Updated last year
AIS-SNU / Smart-Infinity
[HPCA'24] Smart-Infinity: Fast Large Language Model Training using Near-Storage Processing on a Real System
☆49Updated 4 months ago
zyqCSL / DiffKV
☆30Updated last month
stanford-mast / INFaaS
Model-less Inference Serving
☆91Updated 2 years ago
casys-kaist / EnvPipe
☆25Updated 2 years ago
marsupialtail / gpu-sparsert
☆18Updated 5 years ago
Yufeng98 / CENT
Artifact for paper "PIM is All You Need: A CXL-Enabled GPU-Free System for LLM Inference", ASPLOS 2025
☆105Updated 6 months ago