zhisbug / CavsLinks

Cavs: An Efficient Runtime System for Dynamic Neural Networks

☆15

Alternatives and similar repositories for Cavs

Users that are interested in Cavs are comparing it to the libraries listed below

Sorting:

google / iopddl
Supplemental materials for The ASPLOS 2025 / EuroSys 2025 Contest on Intra-Operator Parallelism for Distributed Deep Learning
☆23Updated 5 months ago
parasailteam / coconet
☆83Updated 2 years ago
sjtu-epcc / Tacker
Tacker: Tensor-CUDA Core Kernel Fusion for Improving the GPU Utilization while Ensuring QoS
☆31Updated 8 months ago
humuyan / Korch
ASPLOS'24: Optimal Kernel Orchestration for Tensor Programs with Korch
☆37Updated 6 months ago
nox-410 / tvm.tl
An extention of TVMScript to write simple and high performance GPU kernels with tensorcore.
☆51Updated last year
thu-pacman / PET
PET: Optimizing Tensor Programs with Partially Equivalent Transformations and Automated Corrections
☆122Updated 3 years ago
TiledTensor / TiledLower
TiledLower is a Dataflow Analysis and Codegen Framework written in Rust.
☆14Updated 10 months ago
pku-liang / MAGIS
MAGIS: Memory Optimization via Coordinated Graph Transformation and Scheduling for DNN (ASPLOS'24)
☆55Updated last year
zhuohan123 / terapipe
☆75Updated 4 years ago
UofT-EcoSystem / DietCode
DietCode Code Release
☆65Updated 3 years ago
awslabs / ratex
☆23Updated last month
UofT-EcoSystem / hfta
Boost hardware utilization for ML training workloads via Inter-model Horizontal Fusion
☆32Updated last year
yuyangJin / PerFlow-AI
PerFlow-AI is a programmable performance analysis, modeling, prediction tool for AI system.
☆24Updated last week
awslabs / lorien
☆42Updated 2 years ago
msr-fiddle / harmony
☆17Updated 2 years ago
Raphael-Hao / brainstorm
Compiler for Dynamic Neural Networks
☆46Updated last year
microsoft / SuperScaler
An experimental parallel training platform
☆54Updated last year
apuaaChen / EVT_AE
Artifacts of EVT ASPLOS'24
☆26Updated last year
SJTU-IPADS / disb
DISB is a new DNN inference serving benchmark with diverse workloads and models, as well as real-world traces.
☆54Updated last year
tlc-pack / tenset
☆92Updated 2 years ago
sjtu-epcc / DVABatch
☆21Updated 3 years ago
saareliad / FTPipe
FTPipe and related pipeline model parallelism research.
☆43Updated 2 years ago
GVProf / GVProf
GVProf: A Value Profiler for GPU-based Clusters
☆52Updated last year
microsoft / msccl-tools
Synthesizer for optimal collective communication algorithms
☆118Updated last year
DachengLi1 / AMP
(NeurIPS 2022) Automatically finding good model-parallel strategies, especially for complex models and clusters.
☆41Updated 2 years ago
eniac / paella
Paella: Low-latency Model Serving with Virtualized GPU Scheduling
☆62Updated last year
ceruleangu / Block-Sparse-Benchmark
Benchmark for matrix multiplications between dense and block sparse (BSR) matrix in TVM, blocksparse (Gray et al.) and cuSparse.
☆23Updated 5 years ago
xiezhq-hermann / graphiler
Graphiler is a compiler stack built on top of DGL and TorchScript which compiles GNNs defined using user-defined functions (UDFs) into ef…
☆59Updated 3 years ago
casys-kaist / HUVM
☆24Updated 3 years ago
apache / tvm-ffi
TVM FFI
☆67Updated last week