amirgholami / ai_and_memory_wall

AI and Memory Wall

☆220

Alternatives and similar repositories for ai_and_memory_wall

Users who are interested in ai_and_memory_wall are comparing it to the repositories listed below.

parasailteam / coconet
☆83 Updated 2 years ago
calculon-ai / calculon
☆155 Updated last year
awslabs / slapo
A schedule language for large model training
☆151 Updated 2 months ago
tlc-pack / tenset
☆93 Updated 3 years ago
uwsampl / SparseTIR
SparseTIR: Sparse Tensor Compiler for Deep Learning
☆141 Updated 2 years ago
pku-liang / FlexTensor
Automatic Schedule Exploration and Optimization Framework for Tensor Computations
☆180 Updated 3 years ago
awslabs / raf
☆145 Updated 9 months ago
UofT-EcoSystem / DietCode
DietCode Code Release
☆65 Updated 3 years ago
ConnollyLeon / awesome-Auto-Parallelism
A baseline repository of Auto-Parallelism in Training Neural Networks
☆147 Updated 3 years ago
microsoft / SparTA
☆159 Updated last year
ParCIS / Chimera
Chimera: bidirectional pipeline parallelism for efficiently training large-scale models.
☆67 Updated 7 months ago
pku-liang / AMOS
Automatic Mapping Generation, Verification, and Exploration for ISA-based Spatial Accelerators
☆116 Updated 3 years ago
microsoft / msccl-tools
Synthesizer for optimal collective communication algorithms
☆119 Updated last year
HPDL-Group / Merak
☆80 Updated 6 months ago
mit-han-lab / inter-operator-scheduler
[MLSys 2021] IOS: Inter-Operator Scheduler for CNN Acceleration
☆200 Updated 3 years ago
thu-pacman / PET
PET: Optimizing Tensor Programs with Partially Equivalent Transformations and Automated Corrections
☆121 Updated 3 years ago
ParCIS / Magicube
Magicube is a high-performance library for quantized sparse matrix operations (SpMM and SDDMM) of deep learning on Tensor Cores.
☆89 Updated 2 years ago
masahi / tvm-cutlass-eval
☆41 Updated 3 years ago
mutinifni / splitwise-sim
LLM serving cluster simulator
☆120 Updated last year
apache / tvm-rfcs
A home for the final text of all TVM RFCs.
☆109 Updated last year
UDC-GAC / venom
A Vectorized N:M Format for Unleashing the Power of Sparse Tensor Cores
☆54 Updated last year
abhibambhaniya / GenZ-LLM-Analyzer
LLM Inference analyzer for different hardware platforms
☆96 Updated 4 months ago
alibaba / llm-scheduling-artifact
Artifact of OSDI '24 paper, “Llumnix: Dynamic Scheduling for Large Language Model Serving”
☆63 Updated last year
osayamenja / FlashMoE
Distributed MoE in a Single Kernel [NeurIPS '25]
☆91 Updated last month
zartbot / shallowsim
DeepSeek-V3/R1 inference performance simulator
☆168 Updated 7 months ago
zhaiyi000 / tlm
☆45 Updated last year
microsoft / SuperScaler
An experimental parallel training platform
☆56 Updated last year
alpa-projects / mms
AlpaServe: Statistical Multiplexing with Model Parallelism for Deep Learning Serving (OSDI 23)
☆91 Updated 2 years ago
AlibabaResearch / flash-llm
Flash-LLM: Enabling Cost-Effective and Highly-Efficient Large Generative Model Inference with Unstructured Sparsity
☆223 Updated 2 years ago
zhuohan123 / terapipe
☆77 Updated 4 years ago